Materials and methods for detection and treatment of breast cancer

ABSTRACT

The invention provides a wide range of methods and compositions for detecting and treating breast cancer in an individual. Specifically, the invention provides target breast cancer-associated proteins, which permit a rapid detection, preferably before metastases occur, of breast cancer. The target breast cancer-associated protein may be detected, for example, by reacting the sample with a labeled binding moiety, for example, a labeled antibody capable of binding specifically to the protein. The invention also provides kits useful in the detection of breast cancer in an individual. In addition, the invention provides methods utilizing the breast cancer-associated proteins either as targets for treating breast cancer or as indicators for monitoring the efficacy of such a treatment.

REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Ser. No. 60/165,173, filed Nov. 16, 1999; U.S. Ser. No. 60/172,170, filed Dec. 17, 1999; U.S. Ser. No. 60/178,860, filed Jan. 27, 2000; and U.S. Ser. No. 60/201,721, filed May 3, 2000, the disclosures of which are incorporated by reference herein.

FIELD OF THE INVENTION

The present invention relates generally to methods and compositions for the detection and/or treatment of breast cancer. More specifically, the present invention relates to breast cancer-associated proteins and nucleic acids encoding such proteins which represent cellular markers for breast cancer detection, and molecular targets for breast cancer therapy.

BACKGROUND OF THE INVENTION

Breast cancer is a leading cause of death in women. While the pathogenesis of breast cancer is unclear, transformation of normal breast epithelium to a malignant phenotype may be the result of genetic factors, especially in women under 30 (Miki et al. (1994) Science 266: 66-71). However, it is likely that other, non-genetic factors also have a significant effect on the etiology of the disease. Regardless of its origin, breast cancer morbidity increases significantly if it is not detected early in its progression. Thus, considerable effort has focused on the elucidation of early cellular events surrounding transformation in breast tissue. Such effort has led to the identification of several potential breast cancer markers. For example, alleles of the BRCA1 and BRCA2 genes have been linked to hereditary and early-onset breast cancer (Wooster et al. (1994) Science 265: 2088-2090). The wild-type BRCA1 allele encodes a tumor suppressor protein. Deletions and/or other alterations in that allele have been linked to transformation of breast epithelium. Accordingly, detection of mutated BRCA1 alleles or their gene products has been proposed as a means for detecting breast, as well as ovarian, cancers (Miki et al., supra). However, BRCA1 is limited as a cancer marker because BRCA1 mutations fail to account for the majority of breast cancers (Ford et al. (1995) British J. Cancer 72: 805-812). Similarly, the BRCA2 gene, which has been linked to forms of hereditary breast cancer, accounts for only a small portion of total breast cancer cases (Ford et al., supra).

Several other genes have been linked to breast cancer and may serve as markers for the disease, either directly or via their gene products. Such potential markers include the TP53 gene and its gene product, the p53 tumor suppressor protein (Malkin et al. (1990) Science 250: 1233-1238). The loss of heterozygosity in genes such as the ataxia telangiectasia gene has also been linked to a high risk of developing breast cancer (Swift et al. (1991) N. Engl. J. Med. 325: 1831-1836). A problem associated with many of the markers proposed to date is that the oncogenic phenotype is often the result of a gene deletion, thus requiring detection of the absence of the wild-type form as a predictor of transformation.

There is, therefore, a need in the art for specific, reliable markers that are differentially expressed in normal and transformed breast tissue and that may be useful in the diagnosis of breast cancer, in the prediction of its onset or the treatment of breast cancer. Such markers and methods for their use are provided herein.

SUMMARY OF THE INVENTION

The invention provides a variety of methods and compositions for detecting the presence of breast cancer in a mammal, for example, a human, and for treating breast cancer in a mammal diagnosed with the disease. The invention is based, in part, upon the discovery of a family of proteins each member of which is detectable at a higher concentration in serum from a mammal, for example, a human, with breast cancer relative to serum from a normal mammal, that is, a mammal without breast cancer. Accordingly, these proteins, as well as nucleic acid sequences encoding such proteins, or sequences complementary thereto, can be used as breast cancer markers useful in diagnosing breast cancer, monitoring the efficacy of a breast cancer therapy and/or as targets of such a therapy.

In one aspect, the invention provides isolated breast cancer-associated protein markers. The protein markers are characterized as being detectable at a higher concentration in the serum of a mammal, specifically, a human, with breast cancer than in serum of a mammal without breast cancer.

One marker protein is further characterized in that it has a molecular weight of about 16 kD, and fails to bind in a detectable amount to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 17 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 30 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 20 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 24 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 28 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip. Microsequence analysis has identified the marker protein to be a protein known in the art as small nuclear ribonucleoprotein B″ (Habets et al. (1987) PROC NATL ACAD SCI, USA 84, 2421-2425), the amino acid sequence of which is identified hereinbelow as SEQ ID NO: 5.

Another marker protein is further characterized in that it has a molecular weight of about 35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 18 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 100 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 71 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 100 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI chip. Microsequence analysis has identified the marker protein to be a protein known in the art as, or related to, the 64 kD subunit of cleavage stimulating factor (Takagaki et al. (1992) PROC NATL ACAD SCI, USA 89, 1403-1407), the amino acid sequence of which is identified hereinbelow as SEQ ID NO: 22 and SEQ ID NO: 23.

Another marker protein is further characterized in that it has a molecular weight of about 12 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 150 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a SAX-2 SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 42 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 200 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 56 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 200 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI chip.

Another marker protein is further characterized in that it has a molecular weight of about 35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, and elutes from the anion exchange resin in the presence of 400 mM sodium chloride in 50 mM sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a copper SELDI chip.

Furthermore, the aforementioned breast cancer-associated proteins are further characterized as being non-immunoglobulin and/or non-albumin proteins. Furthermore, the breast cancer-associated proteins may further define an antigenic region or epitope that may bind specifically to a binding moiety, for example, an antibody, for example, a monoclonal or a polyclonal antibody, an antibody fragment thereof, or a biosynthetic antibody binding site directed against the antigenic region or epitope. In addition, the invention enables one skilled in the art to isolate nucleic acids encoding the aforementioned breast cancer-associated proteins or nucleic acids capable of hybridizing under specific hybridization conditions to a nucleic acid encoding the breast cancer-associated proteins. Furthermore, the skilled artisan may produce nucleic acid sequences encoding the entire isolated marker protein, or fragments thereof, using methods currently available in the art (see, for example, Sambrook et al., eds. (1989) “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor Press). For example, the breast cancer-associated protein of the invention, when isolated, can be sequenced using conventional peptide sequencing protocols. Based on the peptide sequence, it is possible to produce oligonucleotide hybridization probes useful in screening a cDNA library. The cDNA library may then be screened with the resultant oligonucleotide to isolate full or partial length cDNA sequences encoding the isolated protein.

In another aspect, the invention provides a variety of methods, for example, protein or nucleic acid-based methods, for detecting the presence of breast cancer in a mammal. The methods of the invention may be performed on any relevant tissue or body fluid sample. For example, methods of the invention may be performed on breast tissue, more preferably breast biopsy tissue. Alternatively, the methods of the invention may be performed on a human body fluid sample selected from the group consisting of: blood; serum; plasma; fecal matter; urine; vaginal secretion; spinal fluid; saliva; ascitic fluid; peritoneal fluid; sputum; and breast exudate. It is contemplated, however, that the methods of the invention also may be useful in detecting metastasized breast cancer cells in other tissue or body fluid samples. Detection of breast cancer can be accomplished using any one of a number of assay methods well known and used in the art.

In one aspect, the method of diagnosing cancer in an individual comprises contacting a sample from the individual with a first binding moiety that binds specifically to a breast cancer-associated protein to produce a first binding moiety-cancer-associated protein complex. The first binding moiety is capable of binding specifically to at least one of the breast cancer-associated marker proteins identified hereinabove to produce a complex. Thereafter, the presence and/or amount of marker protein in the complex can be detected, for example, via the first binding moiety if labeled with a detectable moiety, for example, a radioactive or fluorescent label, or a second binding moiety labeled with a detectable moiety that binds specifically to the first binding moiety using conventional methodologies well known in the art. The presence or amount of the marker protein can thus be indicative of the presence of breast cancer in the individual. For example, the amount of marker protein in the sample may be compared against a threshold value previously calibrated to indicate the presence or absence of breast cancer, wherein the amount of the complex in the sample relative to the threshold value can be indicative of the presence or absence of cancer in the individual. Although such a method can be performed on tissue, for example, breast tissue, or a body fluid, for example, serum, a body fluid currently is the preferred test sample.
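
The threshold comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the marker labels, the units, and every numeric value are hypothetical and are not taken from this disclosure, and a real threshold would have to be calibrated experimentally for each marker.

```python
from typing import Dict, List


def markers_above_threshold(
    measured: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Return the markers whose measured amount exceeds its calibrated threshold.

    Both dictionaries map a marker label to an amount in the same
    (arbitrary) unit. Markers without a calibrated threshold are skipped.
    """
    flagged = []
    for marker, amount in measured.items():
        if marker not in thresholds:
            continue  # no calibration available for this marker
        if amount > thresholds[marker]:
            flagged.append(marker)
    return sorted(flagged)


# Hypothetical readings and thresholds, for illustration only.
readings = {"28 kD": 12.5, "71 kD": 3.0, "16 kD": 7.0}
calibrated = {"28 kD": 10.0, "71 kD": 5.0}
print(markers_above_threshold(readings, calibrated))
```

Reporting the list of markers that exceed their thresholds, rather than a single yes/no value, leaves the interpretation of the panel to the assay designer.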

Detection of the aforementioned nucleic acid molecules can also serve as an indicator of the presence of breast cancer and/or metastasized breast cancer in an individual. Accordingly, in another aspect, the invention provides another method for detecting breast cancer in a human. The method comprises the step of detecting the presence of a nucleic acid molecule in a tissue or body fluid sample thereby to indicate the presence of breast cancer in an individual. The nucleic acid molecule is selected from the group consisting of (i) a nucleic acid molecule comprising a sequence capable of recognizing and being specifically bound by a breast cancer-associated protein, and (ii) a nucleic acid molecule comprising a sequence encoding at least a portion of one or more of the breast cancer-associated proteins identified herein.

In one embodiment, the method comprises exposing a sample from the individual under specific hybridization conditions to a nucleic acid probe, for example, greater than about 10 and more preferably greater than 15 nucleotides in length, capable of hybridizing to a target nucleic acid encoding one of the breast cancer-associated proteins identified herein to produce a duplex. Thereafter, the presence of the duplex can be detected using a variety of detection methods known and used in the art. It is contemplated that the target nucleic acid may be amplified, for example, via conventional polymerase chain reaction (PCR) or reverse transcriptase polymerase chain reaction (RT-PCR) methodologies, prior to hybridization with the nucleic acid probe.

In one embodiment, the target nucleic acid (for example, a messenger RNA (mRNA) molecule), is greater than 15 nucleotides, more preferably greater than 50 nucleotides, and most preferably greater than 100 nucleotides in length and encodes an amino acid sequence present in one of the breast cancer-associated proteins identified herein. Such a target mRNA may then be detected, for example, by Northern blot analysis by reacting the sample with a labeled hybridization probe, for example, a ³²P labeled oligonucleotide probe, capable of hybridizing specifically with at least a portion of the nucleic acid molecule encoding the marker protein. Detection of a nucleic acid molecule either encoding a breast cancer-associated protein or capable of being specifically bound by a breast cancer-associated protein, can thus serve as an indicator of the presence of a breast cancer in the individual being tested.

In another aspect, the invention provides a kit for detecting the presence of breast cancer or for evaluating the efficacy of a therapeutic treatment of a breast cancer. Such kits may comprise, in combination, (i) a receptacle for receiving a human tissue or body fluid sample from the individual to be tested, (ii) a binding partner which binds specifically either to an epitope on a breast cancer-associated marker protein or to a nucleic acid sequence encoding at least a portion of the breast cancer-associated protein, and (iii) a reference sample. In one embodiment, the reference sample may comprise a negative and/or positive control. In that embodiment, the negative control would be indicative of a normal breast cell type and the positive control would be indicative of breast cancer.

In another aspect, the invention provides methods and compositions for treating breast cancer. In one aspect the invention provides proteins or nucleobase-containing sequences useful in the treatment of breast cancer. The therapeutic protein could be, for example, a binding moiety, for example, an antibody, for example, a monoclonal antibody, an antigenic binding fragment thereof, or a biosynthetic antibody binding site capable of binding specifically to a breast cancer-associated protein identified herein. The method comprises the step of administering to a patient with breast cancer, a therapeutically-effective amount of a compound, preferably an antibody, and most preferably a monoclonal antibody, which binds specifically to a target breast cancer-associated protein thereby to inactivate or reduce the biological activity of the protein. The target protein may be any of the breast cancer-associated proteins identified herein. Similarly, it is contemplated that the compound may comprise a small molecule, for example, a small organic molecule, which inhibits or reduces the biological activity of the target breast cancer-associated protein.

In another aspect, the invention provides another method for treating breast cancer. The method comprises the step of administering to a patient diagnosed as having breast cancer, a therapeutically-effective amount of a compound which reduces in vivo the expression of a target breast cancer-associated protein thereby to reduce in vivo the expression of the target protein. In a preferred embodiment, the compound is a nucleobase containing sequence, for example, an anti-sense nucleic acid sequence or a peptidyl nucleic acid (PNA) capable of binding to and reducing the expression (for example, transcription or translation) of a nucleic acid encoding at least a portion of at least one of the breast cancer-associated proteins identified herein. After administration, the anti-sense nucleic acid sequence or the anti-sense PNA molecule binds to the nucleic acid sequences encoding, at least in part, the target protein thereby to reduce in vivo expression of the target breast cancer-associated protein.

Thus, the invention provides a wide range of methods and compositions for detecting and treating breast cancer in an individual. Specifically, the invention provides breast cancer-associated proteins, which permit specific and early, preferably before metastases occur, detection of breast cancer in an individual. In addition, the invention provides kits useful in the detection of breast cancer in an individual. In addition, the invention provides methods utilizing the breast cancer-associated proteins as targets and indicators, for treating breast cancers and for monitoring of the efficacy of such a treatment. These and other numerous additional aspects and advantages of the invention will become apparent upon consideration of the following figures, detailed description, and claims which follow.

DESCRIPTION OF THE DRAWINGS

The invention can be more completely understood with reference to the following drawings, in which:

FIGS. 1A-1C are spectra resulting from the characterization via mass spectrometry of 28 kD proteins subjected to trypsin digestion and eluted from a polyacrylamide gel. FIG. 1A is a spectrum of the heaviest 28 kD protein isolated from the gel, FIG. 1B is a spectrum of the median 28 kD protein isolated from the gel, and FIG. 1C is a spectrum of the lightest 28 kD protein isolated from the gel.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods and compositions for the detection and treatment of breast cancer. The invention is based, in part, upon the discovery of breast cancer-associated proteins which generally are present at detectably higher levels in serum of humans with breast cancer relative to serum of humans without breast cancer.

The breast cancer-associated proteins or nucleic acids encoding such proteins may act as markers useful in the detection of breast cancer or as targets for therapy of breast cancer. For example, it is contemplated that the marker proteins and binding moieties, for example, antibodies that bind to the marker proteins or nucleic acid probes which hybridize to nucleic acid sequences encoding the marker proteins, may be used to detect the presence of breast cancer in an individual. Furthermore, it is contemplated that the skilled artisan may produce novel therapeutics for treating breast cancer which include, for example: antibodies which can be administered to an individual that bind to and reduce or eliminate the biological activity of the target protein in vivo; nucleic acid or peptidyl nucleic acid sequences which hybridize with genes or gene transcripts encoding the target proteins, thereby to reduce expression of the target proteins in vivo; or small molecules, for example, organic molecules which interact with the target proteins or other cellular moieties, for example, receptors for the target proteins, thereby to reduce or eliminate biological activity of the target proteins.

Set forth below are methods for isolating breast cancer-associated proteins, methods for detecting breast cancer using breast cancer-associated proteins as markers, and methods for treating individuals afflicted with breast cancer using breast cancer-associated proteins as targets for cancer therapy.

1. Methods for Detecting Breast Cancer-Associated Marker Proteins.

Marker proteins of the invention, as disclosed herein, are identified by comparing the protein composition of serum of a human diagnosed with breast cancer with the protein composition of serum of a human free of breast cancer. As used herein, the term “breast cancer-associated protein” is understood to mean any protein which is detectable at a higher level in a tissue or body fluid of an individual diagnosed with breast cancer relative to a corresponding tissue or body fluid of an individual free of breast cancer and includes species and allelic variants thereof and fragments thereof. As used herein, the term “breast cancer” is understood to mean any cancer or cancerous lesion associated with breast tissue or breast tissue cells and can include precursors to breast cancer, for example, atypical ductal hyperplasia or non-atypical hyperplasia. It is not necessary that the marker protein or target molecule be unique to a breast cancer cell or body fluid of an individual afflicted with breast cancer; rather the marker protein or target molecule should have a signal to noise ratio high enough to discriminate between samples originating from a breast cancer tissue or body fluid and samples originating from normal breast tissue or body fluid.

As used herein, a “portion” or a “fragment” of a protein or of an amino acid sequence denotes a contiguous peptide comprising, in sequence, at least ten amino acids from the protein or amino acid sequence (e.g., amino acids 1-10, 34-43, or 127-136 of the protein or sequence). Preferably, the peptide comprises, in sequence, at least twenty amino acids from the protein or amino acid sequence. More preferably, the peptide comprises, in sequence, at least forty amino acids from the protein or amino acid sequence.
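
The definition of a “portion” or “fragment” can be expressed as a short check. The sketch below is illustrative only: the function names are hypothetical, positions are taken to be 1-based and inclusive (matching the “amino acids 34-43” style used above), and the example sequence is an arbitrary made-up string, not any sequence disclosed herein.

```python
def fragment_at(protein: str, start: int, end: int) -> str:
    """Return residues start..end of the protein (1-based, inclusive)."""
    if start < 1 or end > len(protein) or start > end:
        raise ValueError("positions must satisfy 1 <= start <= end <= len(protein)")
    return protein[start - 1:end]


def is_fragment(peptide: str, protein: str, min_length: int = 10) -> bool:
    """True if the peptide is a contiguous stretch of the protein
    that is at least min_length residues long (ten by default)."""
    return len(peptide) >= min_length and peptide in protein


# Arbitrary example sequence, for illustration only.
example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
peptide = fragment_at(example, 34, 43)
print(peptide, is_fragment(peptide, example))
```

Passing `min_length=20` or `min_length=40` gives the preferred and more preferred fragment lengths.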

The breast cancer-associated marker proteins of the invention were identified by comparing the proteins present in the serum of individuals with breast cancer to the proteins present in the serum of individuals without breast cancer. Albumin and immunoglobulin proteins were removed from the serum, and the proteins were separated into twelve fractions by anion exchange chromatography. Briefly, the proteins were loaded on a strong anion exchange column in the presence of 50 mM sodium phosphate, pH 7.0, and eluted with a stepwise gradient of sodium chloride in 50 mM sodium phosphate, pH 7.0. The resulting twelve fractions include a flow-through fraction, a fraction eluting in 25 mM sodium chloride, a 50 mM fraction, a 75 mM fraction, a 100 mM fraction, a 125 mM fraction, a 150 mM fraction, a 200 mM fraction, a 250 mM fraction, a 300 mM fraction, a 400 mM fraction, and a 2 M fraction.

Each fraction was analyzed by SELDI (surface-enhanced laser desorption and ionization) mass spectrometry. Samples from each of the twelve fractions were applied to one of four different SELDI chip surfaces. A copper or nickel SELDI surface can be generated by adding a copper or nickel salt solution to a chip comprising ethylenediaminetriacetic acid. Other SELDI chip surfaces include: WCX-2 which comprises carboxylate moieties, and SAX-2 which comprises quaternary ammonium moieties. The breast cancer-associated proteins of the invention can therefore be characterized by their increased presence in serum of individuals having breast cancer relative to individuals without breast cancer, their molecular weight, binding and elution characteristics on an anion exchange resin, and their affinity to a particular SELDI chip. For example, as used herein, the term “affinity” to a particular SELDI chip is understood to mean that the breast cancer-associated proteins of the invention bind preferentially to one type of SELDI chip (e.g., copper SELDI chip) relative to one or more of the other SELDI chips (e.g., the nickel, SAX-2 and WCX-2 chips) disclosed herein. As discussed in detail in Example 1, comparison of the sera from diseased and healthy individuals revealed a number of proteins frequently present at detectable levels in the sera of diseased individuals, but infrequently present at comparable levels in the sera of healthy individuals.
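
Because each marker is characterized by the same three properties (approximate molecular weight, anion exchange elution fraction, and SELDI chip affinity), the markers listed in the Summary can be organized as simple records. The following sketch is one possible organization, not part of the disclosed method; the class and function names are hypothetical, `None` is used here to stand for the flow-through fraction, and 2000 mM stands for the 2 M fraction.

```python
from dataclasses import dataclass
from typing import List, Optional

# The twelve anion exchange fractions, as mM sodium chloride.
# None denotes the flow-through fraction; 2000 denotes the 2 M fraction.
FRACTIONS_MM = (None, 25, 50, 75, 100, 125, 150, 200, 250, 300, 400, 2000)


@dataclass(frozen=True)
class Marker:
    mw_kd: int                 # approximate molecular weight in kD
    elution_mm: Optional[int]  # eluting NaCl concentration (None = flow-through)
    chip: str                  # SELDI chip to which the marker has affinity


# Transcribed from the Summary; the 35 kD / 50 mM / nickel entry appears twice there.
MARKERS = (
    Marker(16, None, "nickel"),
    Marker(17, 25, "WCX-2"), Marker(30, 25, "WCX-2"), Marker(35, 25, "WCX-2"),
    Marker(20, 50, "nickel"), Marker(24, 50, "nickel"), Marker(28, 50, "nickel"),
    Marker(35, 50, "nickel"), Marker(35, 50, "nickel"),
    Marker(18, 100, "WCX-2"), Marker(71, 100, "WCX-2"),
    Marker(12, 150, "SAX-2"),
    Marker(42, 200, "nickel"), Marker(56, 200, "nickel"),
    Marker(35, 400, "copper"),
)


def find_markers(chip: str, elution_mm: Optional[int]) -> List[int]:
    """Return the molecular weights of markers seen on a given chip and fraction."""
    if elution_mm not in FRACTIONS_MM:
        raise ValueError("not one of the twelve fractions")
    return [m.mw_kd for m in MARKERS if m.chip == chip and m.elution_mm == elution_mm]


print(find_markers("nickel", 50))
```

A lookup keyed on chip and fraction mirrors how the fractions were analyzed: each fraction was applied to a chip surface and the resulting spectrum was inspected for peaks at the expected masses.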

Once the breast cancer-associated proteins have been identified by mass spectrometry, the identified proteins can be isolated by standard protein isolation methodologies and sequenced using protein sequencing technologies known and used in the art. See, for example, Examples 5 and 6. Once the amino acid sequences are identified, nucleic acids encoding the marker proteins or portions thereof can be identified using conventional recombinant DNA methodologies. See, for example, Sambrook et al. eds. (1989) “Molecular Cloning: A Laboratory Manual”, Cold Spring Harbor Press. For example, an isolated breast cancer-associated protein can be sequenced using conventional peptide sequencing protocols, and oligonucleotide hybridization probes can then be designed for screening a cDNA library. The cDNA library may then be screened with the resultant hybridization probes to isolate full length or partial length cDNA sequences encoding the isolated marker proteins.

Marker proteins useful in the present invention encompass not only the particular sequences identified herein but also allelic variants thereof and related proteins that also function as marker proteins. Thus, for example, sequences that result from alternative splice forms, post-translational modification, or gene duplication are each encompassed by the present invention. Species variants are also encompassed by this invention where the patient is a non-human mammal. Other homologous proteins that may function as marker proteins are also envisioned. Preferably, variant sequences are at least 80% similar or 70% identical, more preferably at least 90% similar or 80% identical, and most preferably 95% similar or 90% identical to at least a portion of one of the sequences disclosed herein.

To determine whether a candidate peptide region has the requisite percentage similarity or identity to a reference polypeptide or peptide oligomer, the candidate amino acid sequence and the reference amino acid sequence are first aligned using the dynamic programming algorithm described in Smith and Waterman (1981), J. Mol. Biol. 147:195-197, in combination with the BLOSUM62 substitution matrix described in FIG. 2 of Henikoff and Henikoff (1992), “Amino acid substitution matrices from protein blocks”, PNAS (November 1992), 89:10915-10919. For the present invention, an appropriate value for the gap insertion penalty is −12, and an appropriate value for the gap extension penalty is −4. Computer programs performing alignments using the algorithm of Smith-Waterman and the BLOSUM62 matrix, such as the GCG program suite (Oxford Molecular Group, Oxford, England), are commercially available and widely used by those skilled in the art.

Once the alignment between the candidate and reference sequence is made, a percent similarity score may be calculated. The individual amino acids of each sequence are compared sequentially according to their similarity to each other. If the value in the BLOSUM62 matrix corresponding to the two aligned amino acids is zero or a negative number, the pairwise similarity score is zero; otherwise the pairwise similarity score is 1.0. The raw similarity score is the sum of the pairwise similarity scores of the aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent similarity. Alternatively, to calculate a percent identity, the aligned amino acids of each sequence are again compared sequentially. If the amino acids are non-identical, the pairwise identity score is zero; otherwise the pairwise identity score is 1.0. The raw identity score is the sum of the identical aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent identity. Insertions and deletions are ignored for the purposes of calculating percent similarity and identity. Accordingly, gap penalties are not used in this calculation, although they are used in the initial alignment.
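
The scoring procedure just described can be sketched as follows. This is a minimal illustration, not the GCG implementation: it assumes the Smith-Waterman alignment has already been performed and is supplied as two equal-length strings with "-" marking gaps, it assumes those strings span the full candidate and reference sequences, it multiplies the normalized score by 100 to express it as a percentage, and it uses only a small excerpt of BLOSUM62 values sufficient for the example. All function and variable names are hypothetical.

```python
from typing import Dict, Tuple

# Excerpt of BLOSUM62, keyed by alphabetically sorted residue pair.
# A complete matrix would be needed for real sequences.
BLOSUM62_EXCERPT: Dict[Tuple[str, str], int] = {
    ("A", "G"): 0,
    ("K", "R"): 2,
    ("I", "L"): 2,
    ("D", "E"): 2,
    ("W", "W"): 11,
}


def percent_scores(
    aligned_candidate: str,
    aligned_reference: str,
    matrix: Dict[Tuple[str, str], int],
    gap: str = "-",
) -> Tuple[float, float]:
    """Return (percent similarity, percent identity) for a gapped alignment."""
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    raw_similarity = 0
    raw_identity = 0
    for a, b in zip(aligned_candidate, aligned_reference):
        if a == gap or b == gap:
            continue  # insertions and deletions are ignored
        if a == b:
            raw_identity += 1
        if matrix[tuple(sorted((a, b)))] > 0:
            raw_similarity += 1  # positive matrix value scores 1.0, otherwise 0
    # Normalize by the number of residues in the shorter ungapped sequence.
    shorter = min(
        len(aligned_candidate.replace(gap, "")),
        len(aligned_reference.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * raw_similarity / shorter, 100.0 * raw_identity / shorter


similarity, identity = percent_scores("AKLD-W", "GRIEAW", BLOSUM62_EXCERPT)
print(similarity, identity)  # 80.0 20.0
```

In this example the aligned pairs K/R, L/I, D/E and W/W have positive matrix values and A/G has a value of zero, so the raw similarity score is 4; only W/W is identical, so the raw identity score is 1. The shorter sequence has five residues, giving 80% similarity and 20% identity.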

In all instances, variants of the naturally-occurring sequences, as described above, must be tested for their function as marker proteins. Specifically, their presence or absence in a particular form or in a particular biological compartment must be indicative of the presence or absence of cancer in an individual. This routine experimentation can be carried out by the methods described hereinbelow or by other methods known in the art.

Marker proteins in a sample of tissue or body fluid may be detected via binding assays, wherein a binding partner for the marker protein is introduced into a sample suspected of containing the marker protein. In such an assay, the binding partner may be detectably labeled as, for example, with a radioisotopic or fluorescent marker. Labeled antibodies may be used in a similar manner in order to isolate selected marker proteins. Nucleic acids encoding marker proteins may be detected using nucleic acid probes having a sequence complementary to at least a portion of the sequence encoding the marker protein. Techniques such as PCR and, in particular, reverse transcriptase PCR, are useful means for isolating nucleic acids encoding a marker protein. The examples which follow provide details of the isolation and characterization of breast cancer-associated proteins and methods for their use in the detection and treatment of breast cancer.

2. Detection of Breast Cancer

Once breast cancer-associated proteins have been identified, the proteins or nucleic acids encoding the proteins may be used as markers to determine whether an individual has breast cancer and, if so, suitable detection methods can be used to monitor the status of the disease.

Using the marker proteins or nucleic acids encoding the proteins, the skilled artisan can produce a variety of detection methods for detecting breast cancer in a human. The methods typically comprise the steps of detecting, by some means, the presence of one or more breast cancer-associated proteins or nucleic acids encoding such proteins in a tissue or body fluid sample of the human. The accuracy and/or reliability of the method for detecting breast cancer in a human may be further enhanced by detecting the presence of a plurality of breast cancer-associated proteins and/or nucleic acids in a preselected tissue or body fluid sample. The detection assays may comprise one or more of the protocols described hereinbelow.

2.A. Protein-Based Assays

The marker protein in a sample may be detected, for example, by combining the marker protein with a binding moiety capable of specifically binding the marker protein. The binding moiety may comprise, for example, a member of a ligand-receptor pair, i.e., a pair of molecules capable of having a specific binding interaction. The binding moiety may comprise, for example, a member of a specific binding pair, such as antibody-antigen, enzyme-substrate, nucleic acid-nucleic acid, protein-nucleic acid, protein-protein, or other specific binding pair known in the art. Binding proteins may be designed which have enhanced affinity for a target protein. Optionally, the binding moiety may be linked with a detectable label, such as an enzymatic, fluorescent, radioactive, phosphorescent or colored particle label. The labeled complex may be detected, e.g., visually or with the aid of a spectrophotometer or other detector.

Marker proteins may also be detected using gel electrophoresis techniques available in the art. In two-dimensional gel electrophoresis, the proteins are separated first in a pH gradient gel according to their isoelectric point. The resulting gel then is placed on a second polyacrylamide gel, and the proteins separated according to molecular weight (see, for example, O'Farrell (1975) J. Biol. Chem. 250: 4007-4021).

One or more marker proteins may be detected by first isolating proteins from a sample obtained from an individual suspected of having breast cancer, and then separating the proteins by two-dimensional gel electrophoresis to produce a characteristic two-dimensional gel electrophoresis pattern. The pattern may then be compared with a standard gel pattern produced by separating, under the same or similar conditions, proteins isolated from normal or cancer cells. The standard gel pattern may be stored in, and retrieved from, an electronic database of electrophoresis patterns. The presence of a breast cancer-associated protein in the two-dimensional gel provides an indication that the sample being tested was taken from a person with breast cancer. As with the other detection assays described herein, the detection of two or more proteins, for example, in the two-dimensional gel electrophoresis pattern further enhances the accuracy of the assay. The presence of a plurality, e.g., two to five, breast cancer-associated proteins on the two-dimensional gel provides an even stronger indication of the presence of a breast cancer in the individual. The assay thus permits the early detection and treatment of breast cancer.

A breast cancer-associated marker protein may also be detected using any of a wide range of immunoassay techniques available in the art. For example, the skilled artisan may employ the sandwich immunoassay format to detect breast cancer in a body fluid sample. Alternatively, the skilled artisan may use conventional immuno-histochemical procedures for detecting the presence of the breast cancer-associated protein in a tissue sample using one or more labeled binding proteins.

In a sandwich immunoassay, two antibodies capable of binding the marker protein generally are used, e.g., one immobilized onto a solid support, and one free in solution and labeled with a detectable chemical compound. Examples of chemical labels that may be used for the second antibody include radioisotopes, fluorescent compounds, and enzymes or other molecules that generate colored or electrochemically active products when exposed to a reactant or enzyme substrate. When a sample containing the marker protein is placed in this system, the marker protein binds to both the immobilized antibody and the labeled antibody, to form a “sandwich” immune complex on the support's surface. The complexed protein is detected by washing away non-bound sample components and excess labeled antibody, and measuring the amount of labeled antibody complexed to protein on the support's surface. Alternatively, the antibody free in solution, which can be labeled with a chemical moiety, for example, a hapten, may be detected by a third antibody labeled with a detectable moiety which binds the free antibody or, for example, the hapten coupled thereto.

Both the sandwich immunoassay and tissue immunohistochemical procedures are highly specific and very sensitive, provided that labels with good limits of detection are used. A detailed review of immunological assay design, theory and protocols can be found in numerous texts in the art, including “Practical Immunology”, Butt, W. R., ed., (1984) Marcel Dekker, New York and “Antibodies, A Laboratory Approach”, Harlow et al. eds. (1988) Cold Spring Harbor Laboratory.

In general, immunoassay design considerations include preparation of antibodies (e.g., monoclonal or polyclonal antibodies) having sufficiently high binding specificity for the target protein to form a complex that can be distinguished reliably from products of nonspecific interactions. As used herein, the term “antibody” is understood to mean binding proteins, for example, antibodies or other proteins comprising an immunoglobulin variable region-like binding domain, having the appropriate binding affinities and specificities for the target protein. The higher the antibody binding specificity, the lower the target protein concentration that can be detected. As used herein, the terms “specific binding” or “binding specifically” are understood to mean that the binding moiety, for example, a binding protein, has a binding affinity for the target protein of greater than about 10⁵ M⁻¹, more preferably greater than about 10⁷ M⁻¹.

Antibodies to an isolated target breast cancer-associated protein which are useful in assays for detecting a breast cancer in an individual may be generated using standard immunological procedures well known and described in the art. See, for example, Practical Immunology, Butt, W. R., ed., Marcel Dekker, NY, 1984. Briefly, an isolated target protein is used to raise antibodies in a xenogeneic host, such as a mouse, goat or other suitable mammal. The marker protein is combined with a suitable adjuvant capable of enhancing antibody production in the host, and is injected into the host, for example, by intraperitoneal administration. Any adjuvant suitable for stimulating the host's immune response may be used. A commonly used adjuvant is Freund's complete adjuvant (an emulsion comprising killed and dried microbial cells and available from, for example, Calbiochem Corp., San Diego, or Gibco, Grand Island, N.Y.). Where multiple antigen injections are desired, the subsequent injections may comprise the antigen in combination with an incomplete adjuvant (e.g., cell-free emulsion). Polyclonal antibodies may be isolated from the antibody-producing host by extracting serum containing antibodies to the protein of interest. Monoclonal antibodies may be produced by isolating host cells that produce the desired antibody, fusing these cells with myeloma cells using standard procedures known in the immunology art, and screening for hybrid cells (hybridomas) that react specifically with the target protein and have the desired binding affinity.

Antibody binding domains also may be produced biosynthetically and the amino acid sequence of the binding domain manipulated to enhance binding affinity with a preferred epitope on the target protein. Specific antibody methodologies are well understood and described in the literature. A more detailed description of their preparation can be found, for example, in “Practical Immunology” (1984) (supra).

In addition, genetically engineered biosynthetic antibody binding sites, also known in the art as BABS or sFv's, may be used in the practice of the instant invention. Methods for making and using BABS comprising (i) non-covalently associated or disulfide bonded synthetic V_(H) and V_(L) dimers, (ii) covalently linked V_(H)-V_(L) single chain binding sites, (iii) individual V_(H) or V_(L) domains, or (iv) single chain antibody binding sites are disclosed, for example, in U.S. Pat. Nos. 5,091,513; 5,132,405; 4,704,692; and 4,946,778. Furthermore, BABS having requisite specificity for the breast cancer-associated proteins can be derived by phage antibody cloning from combinatorial gene libraries (see, for example, Clackson et al. (1991) Nature 352: 624-628). Briefly, phage each expressing on their coat surfaces BABS having immunoglobulin variable regions encoded by variable region gene sequences derived from mice pre-immunized with isolated breast cancer-associated proteins, or fragments thereof, are screened for binding activity against immobilized breast cancer-associated protein. Phage which bind to the immobilized breast cancer-associated proteins are harvested and the gene encoding the BABS is sequenced. The resulting nucleic acid sequences encoding the BABS of interest then may be expressed in conventional expression systems to produce the BABS protein.

The isolated breast cancer-associated protein also may be used for the development of diagnostic and other tissue evaluating kits and assays to monitor the level of the proteins in a tissue or fluid sample. For example, the kit may include antibodies or other specific binding proteins which bind specifically to the breast cancer-associated proteins and which permit the presence and/or concentration of the breast cancer-associated proteins to be detected and/or quantitated in a tissue or fluid sample.

Suitable kits for detecting breast cancer-associated proteins are contemplated to include, e.g., a receptacle or other means for capturing a sample to be evaluated, and means for detecting the presence and/or quantity in the sample of one or more of the breast cancer-associated proteins described herein. As used herein, “means for detecting” in one embodiment includes one or more antibodies specific for these proteins and means for detecting the binding of the antibodies to these proteins by, e.g., a standard sandwich immunoassay as described herein. Where the presence of a protein within a cell is to be detected, e.g., as from a tissue sample, the kit also may comprise means for disrupting the cell structure so as to expose intracellular proteins.

2.B. Nucleic Acid-based Assays

The presence of a breast cancer in an individual also may be determined by detecting, in a tissue or body fluid sample, a nucleic acid molecule encoding a breast cancer-associated protein. Using methods well known to those of ordinary skill in the art, the breast cancer-associated proteins of the invention may be sequenced, and then, based on the determined sequence, oligonucleotide probes designed for screening a cDNA library (see, for example, Sambrook et al. (1989) supra).

A target nucleic acid molecule encoding a marker breast cancer-associated protein may be detected using a labeled binding moiety capable of specifically binding the target nucleic acid. The binding moiety may comprise, for example, a protein, a nucleic acid or a peptide nucleic acid. Additionally, a target nucleic acid, such as an mRNA encoding a breast cancer-associated protein, may be detected by conducting, for example, a Northern blot analysis using labeled oligonucleotides, e.g., nucleic acid fragments complementary to and capable of hybridizing specifically with at least a portion of a target nucleic acid.

More specifically, gene probes comprising complementary RNA or, preferably, DNA to the breast cancer-associated nucleotide sequences or mRNA sequences encoding breast cancer-associated proteins may be produced using established recombinant techniques or oligonucleotide synthesis. The probes hybridize with complementary nucleic acid sequences presented in the test specimen, and can provide exquisite specificity. A short, well-defined probe, coding for a single unique sequence is most precise and preferred. Larger probes are generally less specific. While an oligonucleotide of any length may hybridize to an mRNA transcript, oligonucleotides typically within the range of 8-100 nucleotides, preferably within the range of 15-50 nucleotides, are envisioned to be most useful in standard hybridization assays. Choices of probe length and sequence allow one to choose the degree of specificity desired. Hybridization is carried out at from 50° to 65° C. in a high salt buffer solution, formamide or other agents to set the degree of complementarity required. Furthermore, the state of the art is such that probes can be manufactured to recognize essentially any DNA or RNA sequence. For additional particulars, see, for example, Guide to Molecular Techniques, Berger et al., Methods in Enzymology, Vol. 152, 1987.
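As a hedged illustration of the probe design just described, a DNA probe complementary to a chosen stretch of an mRNA may be derived as the reverse complement of that stretch. The function name dna_probe_for and the example sequence are hypothetical and are not taken from the disclosure; the length limits simply mirror the 8-100 nucleotide range and the preferred 15-50 nucleotide range stated above.

```python
# Map each mRNA base to the complementary DNA base.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def dna_probe_for(mrna: str, start: int, length: int) -> str:
    """Return a DNA probe (written 5' to 3') complementary to mrna[start:start+length]."""
    if not 8 <= length <= 100:
        raise ValueError("probe length outside the typical 8-100 nucleotide range")
    target = mrna[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("requested window does not fit within the transcript")
    # Complement each base, then reverse so the probe reads 5' to 3'.
    return target.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


def in_preferred_range(probe: str) -> bool:
    """True if the probe length falls within the preferred 15-50 nucleotides."""
    return 15 <= len(probe) <= 50


if __name__ == "__main__":
    example_mrna = "AUGGCCAAGCUGGAUUUCGGA"  # hypothetical transcript fragment
    probe = dna_probe_for(example_mrna, start=0, length=15)
    print(probe, in_preferred_range(probe))
```

The sketch addresses only complementarity and length; it makes no attempt to assess uniqueness of the target sequence or hybridization conditions.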

A wide variety of different labels coupled to the probes or antibodies may be employed in the assays. The labeled reagents may be provided in solution or coupled to an insoluble support, depending on the design of the assay. The various conjugates may be joined covalently or noncovalently, directly or indirectly. When bonded covalently, the particular linkage group will depend upon the nature of the two moieties to be bonded. A large number of linking groups and methods for linking are taught in the literature. Broadly, the labels may be divided into the following categories: chromogens; catalyzed reactions; chemiluminescence; radioactive labels; and colloidal-sized colored particles. The chromogens include compounds which absorb light in a distinctive range so that a color may be observed, or emit light when irradiated with light of a particular wavelength or wavelength range, e.g., fluorescers. Both enzymatic and nonenzymatic catalysts may be employed. In choosing an enzyme, there will be many considerations including the stability of the enzyme, whether it is normally present in samples of the type for which the assay is designed, the nature of the substrate, and the effect if any of conjugation on the enzyme's properties. Potentially useful enzyme labels include oxidoreductases, transferases, hydrolases, lyases, isomerases, ligases, or synthetases. Interrelated enzyme systems may also be used. A chemiluminescent label involves a compound that becomes electronically excited by a chemical reaction and may then emit light that serves as a detectable signal or donates energy to a fluorescent acceptor. Radioactive labels include various radioisotopes found in common use such as the unstable forms of hydrogen, iodine, phosphorus or the like. Colloidal-sized colored particles involve material such as colloidal gold that, in aggregate, form a visually detectable distinctive spot corresponding to the site of a substance to be detected. Additional information on labeling technology is disclosed, for example, in U.S. Pat. No. 4,366,241.

A common method of in vitro labeling of nucleotide probes involves nick translation wherein the unlabeled DNA probe is nicked with an endonuclease to produce free 3′ hydroxyl termini within either strand of the double-stranded fragment. Simultaneously, an exonuclease removes the nucleotide residue from the 5′ phosphoryl side of the nick. The sequence of replacement nucleotides is determined by the sequence of the opposite strand of the duplex. Thus, if labeled nucleotides are supplied, DNA polymerase will fill in the nick with the labeled nucleotides. Using this well-known technique, up to 50% of the molecule can be labeled. For smaller probes, known methods involving 3′ end labeling may be used. Furthermore, there are currently commercially available methods of labeling DNA with fluorescent molecules, catalysts, enzymes, or chemiluminescent materials. Biotin labeling kits are commercially available (Enzo Biochem Inc.) under the trademark Bio-Probe. This type of system permits the probe to be coupled to avidin which in turn is labeled with, for example, a fluorescent molecule, enzyme, antibody, etc. For further disclosure regarding probe construction and technology, see, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual (Cold Spring Harbor, N.Y., 1982).

The oligonucleotide selected for hybridizing to the target nucleic acid, whether synthesized chemically or by recombinant DNA methodologies, is isolated and purified using standard techniques and then preferably labeled (e.g., with ³⁵S or ³²P) using standard labeling protocols. A sample containing the target nucleic acid then is run on an electrophoresis gel, the dispersed nucleic acids transferred to a nitrocellulose filter and the labeled oligonucleotide exposed to the filter under stringent hybridizing conditions, e.g., 50% formamide, 5× SSPE, 2× Denhardt's solution, 0.1% SDS at 42° C., as described in Sambrook et al. (1989) supra. The filter may then be washed using 2× SSPE, 0.1% SDS at 68° C., and more preferably using 0.1× SSPE, 0.1% SDS at 68° C. Other useful procedures known in the art include solution hybridization, and dot and slot RNA hybridization. Optionally, the amount of the target nucleic acid present in a sample is then quantitated by measuring the radioactivity of hybridized fragments, using standard procedures known in the art.

In addition, oligonucleotides also may be used to identify other sequences encoding members of the target protein families. The methodology also may be used to identify genetic sequences associated with the nucleic acid sequences encoding the proteins described herein, e.g., to identify non-coding sequences lying upstream or downstream of the protein coding sequence, and which may play a functional role in expression of these genes. Additionally, binding assays may be conducted to identify and detect proteins capable of a specific binding interaction with a nucleic acid encoding a breast cancer-associated protein, which may be involved, e.g., in gene regulation or gene expression of the protein. In a further embodiment, the assays described herein may be used to identify and detect nucleic acid molecules comprising a sequence capable of recognizing and being specifically bound by a breast cancer-associated protein.

In addition, it is anticipated that using a combination of appropriate oligonucleotide primers, i.e., more than one primer, the skilled artisan may determine the level of expression of a target gene in vivo by standard polymerase chain reaction (PCR) procedures, for example, by quantitative PCR. Conventional PCR based assays are discussed, for example, in Innis et al. (1990) “PCR Protocols; A guide to methods and Applications”, Academic Press and Innis et al. (1995) “PCR Strategies” Academic Press, San Diego, Calif.

3. Identification of Proteins Which Interact In Vivo With Breast Cancer-Associated Proteins

In addition, it is contemplated that the skilled artisan, using procedures like those described hereinbelow, may identify other molecules which interact in vivo with the breast cancer-associated proteins described herein. Such molecules also may provide possible targets for chemotherapy.

By way of example, cDNA encoding proteins or peptides capable of interacting with breast cancer-associated proteins can be determined using a two-hybrid assay, as reported in Durfee et al. (1993) Genes & Develop. 7: 555-559. The principle of the two hybrid system is that noncovalent interaction of two proteins triggers a process (transcription) in which these proteins normally play no direct role, because of their covalent linkage to domains that function in this process. For example, in the two-hybrid assay, detectable expression of a reporter gene occurs when two fusion proteins, one comprising a DNA-binding domain and one comprising a transcription initiation domain, interact.

The skilled artisan can use a host cell that contains one or more reporter genes, such as yeast strain Y153, reported in Durfee et al. (1993) supra. This strain carries two chromosomally located reporter genes whose expression is regulated by Gal4. A first reporter gene is the E. coli lacZ gene under the control of the Gal4 promoter. A second reporter gene is the selectable HIS3 gene. Other useful reporter genes may include, for example, the luciferase gene, the LEU2 gene, and the GFP (Green Fluorescent Protein) gene.

Two sets of plasmids are used in the two hybrid system. One set of plasmids contains DNA encoding a Gal4 DNA-binding domain fused in frame to DNA encoding a breast cancer-associated protein. The other set of plasmids contains DNA encoding a Gal4 activation domain fused to portions of a human cDNA library constructed from human lymphocytes. Expression from the first set of plasmids results in a fusion protein comprising a Gal4 DNA-binding domain and a breast cancer-associated protein. Expression from the second set of plasmids produces a transcription activation protein fused to an expression product from the lymphocyte cDNA library. When the two plasmids are transformed into a Gal4-deficient host cell, such as the yeast Y153 cells described above, interaction of the Gal4 DNA binding domain and transcription activation domain occurs only if the breast cancer-associated protein fused to the DNA binding domain binds to a protein expressed from the lymphocyte cDNA library fused to the transcription activating domain. As a result of the protein-protein interaction between the breast cancer-associated protein and its in vivo binding partner, detectable levels of reporter gene expression occur.

In addition to identifying molecules which interact in vivo with the breast cancer-associated proteins, the skilled artisan may also screen for molecules, for example, small molecules which alter or inhibit specific interaction between a breast cancer-associated protein and its in vivo binding partner.

For example, a host cell can be transfected with DNA encoding a suitable DNA binding domain/breast cancer-associated protein hybrid and a transcription activation domain/putative breast cancer-associated protein binding partner, as disclosed above. The host cell also contains a suitable reporter gene in operative association with a cis-acting transcription activation element that is recognized by the transcription factor DNA binding domain. The level of reporter gene expressed in the system is assayed. Then, the host cell is exposed to a candidate molecule and the level of reporter gene expression is detected. A reduction in reporter gene expression is indicative of the candidate's ability to interfere with complex formation or stability with respect to the breast cancer-associated protein and its in vivo binding partner. As a control, the candidate molecule's ability to interfere with other, unrelated protein-protein complexes is also tested. Molecules capable of specifically interfering with a breast cancer-associated protein/binding partner interaction, but not other protein-protein interactions, are identified as candidates for production and further analysis. Once a potential candidate has been identified, its efficacy in modulating cell cycling and cell replication can be assayed in a standard cell cycle model system.
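The decision logic of such a screen may be sketched as follows. This is an illustration only: the function name is_specific_inhibitor and both threshold values are hypothetical and are not taught in this disclosure, and the inputs are assumed to be reporter gene expression levels measured in arbitrary but consistent units before and after exposure to the candidate molecule.

```python
def is_specific_inhibitor(
    target_before: float,
    target_after: float,
    control_before: float,
    control_after: float,
    min_target_reduction: float = 0.5,    # hypothetical cut-off
    max_control_reduction: float = 0.1,   # hypothetical cut-off
) -> bool:
    """Flag a candidate that reduces reporter expression for the breast
    cancer-associated protein/binding partner pair without comparably
    reducing it for an unrelated protein-protein pair (the control)."""
    if target_before <= 0 or control_before <= 0:
        raise ValueError("baseline reporter levels must be positive")
    # Fractional reduction in reporter expression after exposure.
    target_reduction = 1.0 - target_after / target_before
    control_reduction = 1.0 - control_after / control_before
    return (target_reduction >= min_target_reduction
            and control_reduction <= max_control_reduction)


if __name__ == "__main__":
    # Candidate lowers the target reporter by 70% and the control by 5%.
    print(is_specific_inhibitor(100.0, 30.0, 100.0, 95.0))
```

A candidate that lowers expression in both the test system and the control system would be rejected by this logic as a non-specific inhibitor.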

Candidate molecules can be produced as described hereinbelow. For example, DNA encoding the candidate molecules can be inserted, using conventional techniques well described in the art (see, for example, Sambrook (1989) supra) into any of a variety of expression vectors and transfected into an appropriate host cell to produce recombinant proteins, including both full length and truncated forms. Useful host cells include E. coli, Saccharomyces cerevisiae, Pichia pastoris, the insect/baculovirus cell system, myeloma cells, and various other mammalian cells. The full length forms of such proteins are preferably expressed in mammalian cells, as disclosed herein. The nucleotide sequences also preferably include a sequence for targeting the translated sequence to the nucleus, using, for example, a sequence encoding the eight amino acid nucleus targeting sequence of the large T antigen, which is well characterized in the art. The vector can additionally include various sequences to promote correct expression of the recombinant protein, including transcription promoter and termination sequences, enhancer sequences, preferred ribosome binding site sequences, preferred mRNA leader sequences, preferred protein processing sequences, preferred signal sequences for protein secretion, and the like. The DNA sequence encoding the gene of interest can also be manipulated to remove potentially inhibiting sequences or to minimize unwanted secondary structure formation. As will be appreciated by the practitioner in the art, the recombinant protein can also be expressed as a fusion protein.

After translation, the protein can be purified from the cells themselves or recovered from the culture medium. The DNA can also include sequences which aid in expression and/or purification of the recombinant protein. The DNA can be expressed directly or can be expressed as part of a fusion protein having a readily cleavable fusion junction.

The DNA may also be expressed in a suitable mammalian host. Useful hosts include fibroblast 3T3 cells (e.g., NIH 3T3, from CRL 1658), COS (simian kidney, ATCC CRL-1650) or CHO (Chinese hamster ovary) cells (e.g., CHO-DXB11, from Chasin (1980) Proc. Nat'l. Acad. Sci. USA 77: 4216-4222), mink-lung epithelial cells (MV1Lu), human foreskin fibroblast cells, human glioblastoma cells, and teratocarcinoma cells. Other useful eukaryotic cell systems include yeast cells, the insect/baculovirus system or myeloma cells.

In order to express a candidate molecule, the DNA is subcloned into an insertion site of a suitable, commercially available vector along with suitable promoter/enhancer sequences and 3′ termination sequences. Useful promoter/enhancer sequence combinations include the CMV promoter (human cytomegalovirus (MIE) promoter) present, for example, on pCDM8, as well as the mammary tumor virus promoter (MMTV) boosted by the Rous sarcoma virus LTR enhancer sequence (e.g., from Clontech, Inc., Palo Alto). A useful inducible promoter includes, for example, a Zn²⁺-inducible promoter, such as the Zn²⁺ metallothionein promoter (Wrana et al. (1992) Cell 71: 1003-1014). Other inducible promoters are well known in the art and can be used with similar success. Expression also can be further enhanced using trans-activating enhancer sequences. The plasmid also preferably contains an amplifiable marker, such as DHFR under suitable promoter control, e.g., SV40 early promoter (ATCC #37148). Transfection, cell culturing, gene amplification and protein expression conditions are standard conditions, well known in the art, such as are described, for example in Ausubel et al., ed., (1989) “Current Protocols in Molecular Biology”, John Wiley & Sons, NY. Briefly, transfected cells are cultured in medium containing 5-10% dialyzed fetal calf serum (dFCS), and stably transfected high expression cell lines are obtained by amplification and subcloning and evaluated by standard Western and Northern blot analysis. Southern blots also can be used to assess the state of integrated sequences and the extent of their copy number amplification.

The expressed candidate protein is then purified using standard procedures. A currently preferred methodology uses an affinity column, such as a ligand affinity column or an antibody affinity column. The column then is washed, and the candidate molecules selectively eluted in a gradient of increasing ionic strength, changes in pH, or addition of mild detergent. It is appreciated that in addition to the candidate molecules which bind to the breast cancer-associated proteins, the breast cancer-associated proteins themselves may likewise be produced using such recombinant DNA technologies.

4. Breast Cancer Therapy and Methods for Monitoring Therapy

The skilled artisan, after identification of breast cancer-associated proteins and proteins which interact with the breast cancer-associated proteins, can develop a variety of therapies for treating breast cancer. Because the marker proteins described herein are present at detectably higher levels in breast cancer cells relative to normal breast cells, the skilled artisan may employ, for example, the marker proteins and/or nucleic acids encoding the marker proteins as target molecules for a cancer chemotherapy.

4.A. Anti-Sense-Based Therapeutics

A particularly useful cancer therapeutic envisioned is an oligonucleotide or peptide nucleic acid sequence complementary and capable of hybridizing under physiological conditions to part, or all, of the gene encoding the marker protein or to part, or all, of the transcript encoding the marker protein thereby to reduce or inhibit transcription and/or translation of the marker protein gene. Alternatively, the same technologies may be applied to reduce or inhibit transcription and/or translation of the proteins which interact with the breast cancer-associated proteins.

Anti-sense oligonucleotides have been used extensively to inhibit gene expression in normal and abnormal cells. See, for example, Stein et al. (1988) Cancer Res. 48: 2659-2668, for a pertinent review of anti-sense theory and established protocols. In addition, the synthesis and use of peptide nucleic acids as anti-sense-based therapeutics are described in PCT publications PCT/EP92/01219 published Nov. 26, 1992, PCT/US92/10921 published Jun. 24, 1993, and PCT/US94/013523 published Jun. 1, 1995. Accordingly, the anti-sense-based therapeutics may be used as part of chemotherapy, either alone or in combination with other therapies.

Anti-sense oligonucleotide and peptide nucleic acid sequences are capable of hybridizing to a gene and/or mRNA transcript and, therefore, may be used to inhibit transcription and/or translation of the protein described herein. It is appreciated, however, that oligoribonucleotide sequences generally are more susceptible to enzymatic attack by ribonucleases than are deoxyribonucleotide sequences. Hence, oligodeoxyribonucleotides are preferred over oligoribonucleotides for in vivo therapeutic use. It is appreciated that the peptide nucleic acid sequences, unlike regular nucleic acid sequences, are not susceptible to nuclease degradation and, therefore, are likely to have greater longevity in vivo. Furthermore, it is appreciated that peptide nucleic acid sequences bind complementary single stranded DNA and RNA strands more strongly than corresponding DNA sequences (see, for example, PCT/EP92/20702 published Nov. 26, 1992). Accordingly, peptide nucleic acid sequences are preferred for in vivo therapeutic use.

Therapeutically useful anti-sense oligonucleotides or peptide nucleic acid sequences may be synthesized by any of the chemical oligonucleotide and peptide nucleic acid synthesis methodologies well known and thoroughly described in the art. Alternatively, a sequence complementary to part or all of the natural mRNA sequence may be generated using standard recombinant DNA technologies.

Because the complete nucleotide sequence encoding the entire marker protein as well as additional 5′ and 3′ untranslated sequences are known for each of the marker proteins and/or can be determined readily using techniques well known in the art, anti-sense oligonucleotides or peptide nucleic acids which hybridize with any portion of the mRNA transcript or non-coding sequences may be prepared using conventional oligonucleotide and peptide nucleic acid synthesis methodologies.

Oligonucleotides complementary to, and hybridizable with, any portion ofthe mRNA transcripts encoding the marker proteins are, in principle,effective for inhibiting translation of the target proteins as describedherein. For example, as described in U.S. Pat. No. 5,098,890, issuedMar. 24, 1992, oligonucleotides complementary to mRNA at or near thetranslation initiation codon site may be used to inhibit translation.Moreover, it has been suggested that sequences that are too distant inthe 3′ direction from the translation initiation site may be lesseffective in hybridizing the mRNA transcripts because of potentialribosomal “read-through”, a phenomenon whereby the ribosome ispostulated to unravel the anti-sense/sense duplex to permit translationof the message.

A variety of sequence lengths of oligonucleotide or peptide nucleic acidmay be used to hybridize to mRNA transcripts. However, very shortsequences (e.g., sequences containing less than 8-15 nucleobases) maybind with less specificity. Moreover, for in vivo use, shortoligonucleotide sequences may be particularly susceptible to enzymaticdegradation. Peptide nucleic acids, as mentioned above, likely areresistant to nuclease degradation. Where oligonucleotide and peptidenucleic acid sequences are to be provided directly to the cells, verylong sequences may be less effective at inhibition because of decreaseduptake by the target cell. Accordingly, where the oligonucleotide orpeptide nucleic acid is to be provided directly to target cells,oligonucleotide and/or peptide nucleic acid sequences containing about8-50 nucleobases, and more preferably 15-30 nucleobases, are envisionedto be most advantageous.
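The design constraints described above (complementarity to the mRNA, a position at or near the translation initiation codon, and a length of 15-30 nucleobases) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the `flank` parameter, and the example sequence are hypothetical and are not taken from the disclosure, and no marker protein sequence is implied.

```python
# Illustrative sketch only. Names and the example sequence are hypothetical.
# Maps each mRNA base to the complementary DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(mrna_segment: str) -> str:
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA segment (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna_segment.upper()))


def candidate_oligos(mrna: str, start_codon_index: int, length: int = 20, flank: int = 10):
    """List (offset, oligo) pairs for windows beginning within `flank` bases of the start codon.

    `flank` is an assumed parameter standing in for "at or near" the initiation codon.
    """
    if not 15 <= length <= 30:
        raise ValueError("length is outside the 15-30 nucleobase range preferred in the text")
    first = max(0, start_codon_index - flank)
    last = min(len(mrna) - length, start_codon_index + flank)
    return [(offset, antisense_dna(mrna[offset:offset + length]))
            for offset in range(first, last + 1)]


# Made-up placeholder transcript fragment, used only to exercise the functions.
example_mrna = "GCCACCAUGGCUUCGAACCUGAAGGACUUCCUGAAGG"
oligos = candidate_oligos(example_mrna, example_mrna.find("AUG"))
```

Each candidate would still need to be evaluated experimentally, for example in the dose range assays discussed in this section; the sketch only enumerates sequences that satisfy the stated length and position preferences.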

An alternative means for providing anti-sense oligonucleotide sequencesto a target cell is gene therapy where, for example, a DNA sequence,preferably as part of a vector and associated with a promoter, isexpressed constitutively inside the target cell. Oeller et al. (Oelleret al. (1992) Science 254: 437-539) describe the in vivo inhibition ofthe ACC synthase enzyme using a constitutively expressible DNA sequenceencoding an anti-sense sequence to the full length ACC synthasetranscript. Accordingly, where the anti-sense oligonucleotide sequencesare provided to a target cell indirectly, for example, as part of anexpressible gene sequence to be expressed within the cell, longeroligonucleotide sequences, including sequences complementary tosubstantially all the protein coding sequence, may be used to advantage.

Finally, therapeutically useful oligonucleotide sequences envisionedalso include not only native oligomers composed of naturally occurringnucleotides, but also those comprising modified nucleotides, forexample, to improve stability and lipid solubility and thereby enhancecellular uptake. For example, it is known that enhanced lipid solubilityand/or resistance to nuclease digestion results by substituting a methylgroup or sulfur atom for a phosphate oxygen in the internucleotidephosphodiester linkage. Phosphorothioates (“S-oligonucleotides” whereina phosphate oxygen is replaced by a sulfur atom), in particular, arestable to nuclease cleavage, are soluble in lipids, and are preferred,particularly for direct oligonucleotide administration.S-oligonucleotides may be synthesized chemically using conventionalsynthesis methodologies well known and thoroughly described in the art.

Preferred synthetic internucleoside linkages include phosphorothioates,alkylphosphonates, phosphorodithioates, phosphate esters,alkylphosphonothioates, phosphoramidates, carbamates, carbonates,phosphate triesters, acetamidate, and carboxymethyl esters. Furthermore,one or more of the 5′-3′ phosphate group may be covalently joined to alow molecular weight (e.g., 15-500 Da) organic group, including, forexample, lower alkyl chains or aliphatic groups (e.g., methyl, ethyl,propyl, butyl), substituted alkyl and aliphatic groups (e.g.,aminoethyl, aminopropyl, aminohydroxyethyl, aminohydroxypropyl), smallsaccharides or glycosyl groups. Other low molecular weight organicmodifications include additions to the internucleoside phosphatelinkages such as cholesteryl or diamine compounds with varying numbersof carbon residues between the amino groups and terminal ribose.Oligonucleotides with these linkages or with other modifications can beprepared using methods well known in the art (see, for example, U.S.Pat. No. 5,149,798).

Suitable oligonucleotide and/or peptide nucleic acid sequences which inhibit transcription and/or translation of the marker proteins can be identified using standard in vivo assays well characterized in the art. Preferably, a range of doses is used to determine effective concentrations for inhibition as well as specificity of hybridization. For example, in the case of an oligonucleotide, a dose range of 0-100 μg oligonucleotide/ml may be assayed. Further, the oligonucleotides may be provided to the cells in a single transfection, or as part of a series of transfections. Anti-sense efficacy may be determined by assaying a change in cell proliferation over time following transfection, using standard cell counting methodology, and/or by assaying for reduced expression of marker protein, e.g., by immunofluorescence. Alternatively, the ability of cells to take up and use thymidine is another standard means of assaying for cell division and may be used here, e.g., using ³H-thymidine. Effective anti-sense inhibition should inhibit cell division sufficiently to reduce thymidine uptake, inhibit cell proliferation, and/or reduce detectable levels of marker proteins.

It is anticipated that therapeutically effective oligonucleotide orpeptide nucleic acid concentrations may vary according to the nature andextent of the neoplasm, the particular nucleobase sequence used, therelative sensitivity of the neoplasm to the oligonucleotide or peptidenucleic acid sequence, and other factors. Useful ranges for a given celltype and oligonucleotide and/or peptide nucleic acid may be determinedby performing standard dose range experiments. Dose range experimentsalso may be performed to assess toxicity levels for normal and malignantcells. It is contemplated that useful concentrations may range fromabout 1 to 100 μg/ml per 10⁵ cells.

For in vivo use, the anti-sense oligonucleotide or peptide nucleic acidsequences may be combined with a pharmaceutically acceptable carrier,such as a suitable liquid vehicle or excipient, and optionally anauxiliary additive or additives. Liquid vehicles and excipients areconventional and are available commercially. Illustrative thereof aredistilled water, physiological saline, aqueous solutions of dextrose,and the like. For in vivo cancer therapies, the anti-sense sequencespreferably can be provided directly to malignant cells, for example, byinjection directly into the tumor. Alternatively, the oligonucleotide orpeptide nucleic acid may be administered systemically, provided that theanti-sense sequence is associated with means for directing the sequencesto the target malignant cells.

In addition to administration with conventional carriers, the anti-sense oligonucleotide or peptide nucleic acid sequences may be administered by a variety of specialized oligonucleotide delivery techniques. For example, oligonucleotides may be encapsulated in liposomes, as described in Mannino et al. (1988) BioTechnology 6: 682, and Felgner et al. (1989) Bethesda Res. Lab. Focus 11:21. Lipids useful in producing liposomal formulations include, without limitation, monoglycerides, diglycerides, sulfatides, lysolecithin, phospholipids, saponin, bile acids, and the like. Preparation of such liposomal formulations is within the level of skill in the art (see, for example, U.S. Pat. No. 4,235,871; U.S. Pat. No. 4,501,728; U.S. Pat. No. 4,837,028; and U.S. Pat. No. 4,737,323). The pharmaceutical composition of the invention may further include compounds such as cyclodextrins and the like which enhance delivery of oligonucleotides into cells. When the composition is not administered systemically but, rather, is injected at the site of the target cells, cationic detergents (e.g., Lipofectin) may be added to enhance uptake. In addition, reconstituted virus envelopes have been successfully used to deliver RNA and DNA to cells (see, for example, Arad et al. (1986) Biochim. Biophys. Acta 859: 88-94).

For therapeutic use in vivo, the anti-sense oligonucleotide and/or peptide nucleic acid sequences are administered to the individual in a therapeutically effective amount, for example, an amount sufficient to reduce or inhibit target protein expression in malignant cells. The actual dosage administered may take into account whether the treatment is prophylactic or therapeutic, the age, weight, and health of the patient, the route of administration, the size and nature of the malignancy, as well as other factors. The daily dosage may range from about 0.01 to 1,000 mg per day. Greater or lesser amounts of oligonucleotide or peptide nucleic acid sequences may be administered, as required. As will be appreciated by those skilled in the medical art, particularly the chemotherapeutic art, determining appropriate dose ranges for in vivo administration would be a matter of routine experimentation for a clinician. As a preliminary guideline, effective concentrations for in vitro inhibition of the target molecule may be determined first.

4.B. Binding Protein-Based Therapeutics.

As mentioned above, a cancer marker protein or a protein that interactswith the cancer marker protein may be used as a target for chemotherapy.For example, a binding protein designed to bind the marker proteinessentially irreversibly can be provided to the malignant cells, forexample, by association with a ligand specific for the cell and known tobe absorbed by the cell. Means for targeting molecules to particularcells and cell types are well described in the chemotherapeutic art.

Binding proteins may be obtained and tested using technologies wellknown in the art. For example, the binding portions of antibodies may beused to advantage. It is contemplated, however, that intact antibodiesor BABS that have preferably been humanized may be used in the practiceof the invention. As used herein, the term “humanized” is understood tomean a process whereby the framework region sequences of a non-humanimmunoglobulin variable region are replaced by corresponding humanframework sequences. Accordingly, it is contemplated that such humanizedbinding proteins will elicit a weaker immune response than theirunhumanized counterparts. Particularly useful are binding proteinsidentified with high affinity for the target protein, e.g., greater thanabout 10⁹ M⁻¹. Alternatively, DNA encoding the binding protein may beprovided to the target cell as part of an expressible gene to beexpressed within the cell following the procedures used for gene therapyprotocols well described in the art. See, for example, U.S. Pat. No.4,497,796, and “Gene Transfer”, Vijay R. Baichwal, ed., (1986). It isanticipated that, once bound by binding protein, the target protein willbe inactivated or its biological activity reduced thereby inhibiting orretarding cell division.

As described above, suitable binding proteins for in vivo use may becombined with a suitable pharmaceutically-acceptable carrier, such asphysiological saline or other useful carriers well characterized in themedical art. The pharmaceutical compositions may be provided directly tomalignant cells, for example, by direct injection, or may be providedsystemically, provided the binding protein is associated with means fortargeting the protein to target cells. Finally, suitable dose ranges andcell toxicity levels may be assessed using standard dose rangeexperiments. Therapeutically-effective concentrations may range fromabout 0.01 to about 1,000 mg per day. As described above, actual dosagesadministered may vary depending, for example, on the nature of themalignancy, the age, weight and health of the individual, as well asother factors.

4.C. Small Molecule-Based Therapeutics.

After having isolated breast cancer-associated proteins, the skilledartisan can, using methodologies well known in the art, screen smallmolecule libraries (either peptide or non-peptide based libraries) toidentify candidate molecules that reduce or inhibit the biologicalfunction of the breast cancer-associated proteins. The small moleculespreferably accomplish this function by reducing the in vivo expressionof the target molecule, or by interacting with the target moleculethereby to inhibit either the biological activity of the target moleculeor an interaction between the target molecule and its in vivo bindingpartner.

It is contemplated that, once the candidate small molecules have been elucidated, the skilled artisan may enhance the efficacy of the small molecule using rational drug design methodologies well known in the art. Alternatively, the skilled artisan may use a variety of computer programs which assist in developing quantitative structure activity relationships (QSAR), which further assist the design of additional candidate molecules de novo. Once identified, the small molecules may be produced in commercial quantities and subjected to the appropriate safety and efficacy studies.

It is contemplated that the screening assays may be automated, thereby facilitating the screening of a large number of small molecules at the same time. Such automation procedures are within the level of skill in the art of drug screening and, therefore, are not discussed herein.

Candidate peptide-based small molecules may be produced by expression of an appropriate nucleic acid sequence in a host cell or using synthetic organic chemistries. Similarly, non-peptidyl-based small molecules may be produced using conventional synthetic organic chemistries well known in the art.

As described above, for in vivo use, the identified small molecules may be combined with a suitable pharmaceutically acceptable carrier, such as physiological saline or other useful carriers well characterized in the medical art. The pharmaceutical compositions may be provided directly to malignant cells, for example, by direct injection, or may be provided systemically, provided the small molecule is associated with means for targeting the molecule to target cells. Finally, suitable dose ranges and cell toxicity levels may be assessed using standard dose range experiments. As described above, actual dosages administered may vary depending, for example, on the nature of the malignancy, the age, weight and health of the individual, as well as other factors.

4.D. Methods for Monitoring the Status of Breast Cancer in an Individual

The progression of the breast cancer or the therapeutic efficacy ofchemotherapy may be measured using procedures well known in the art. Forexample, the efficacy of a particular chemotherapeutic agent can bedetermined by measuring the amount of a breast cancer-associated proteinreleased from breast cancer cells undergoing cell death. As reported inU.S. Pat. Nos. 5,840,503 and 5,965,376, soluble nuclear matrix proteinsand fragments thereof are released by cells upon cell death. Suchsoluble nuclear matrix proteins can be quantitated in a body fluid andused to monitor the degree or rate of cell death in a tissue. Similarly,the levels of one or more breast cancer-associated proteins could beused as an indication of the status of breast cancer in the individual.

For example, the concentration of a breast cancer-associated protein or a fragment thereof released from cells is compared to standards from healthy, untreated tissue. Fluid samples are collected at discrete intervals during treatment and compared to the standard. It is contemplated that changes in the level of the breast cancer-associated protein, for example, will be indicative of the efficacy of treatment (that is, the rate of cancer cell death). It is contemplated that the release of soluble, breast cancer-associated proteins can be measured in blood, plasma, urine, sputum, vaginal secretion, breast exudate, and other body fluids.

Where the assay is used to monitor tissue viability or progression ofbreast cancer, the step of detecting the presence and abundance of themarker protein or its transcript in samples of interest is repeated atintervals and these values then are compared, the changes in thedetected concentrations reflecting changes in the status of the tissue.For example, an increase in the level of one or more breastcancer-associated proteins may correlate with progression of the breastcancer. Where the assay is used to evaluate the efficacy of a therapy,the monitoring steps occur following administration of the therapeuticagent or procedure (e.g., following administration of a chemotherapeuticagent or following radiation treatment). Similarly, a decrease in thelevel of breast cancer-associated proteins may correlate with aregression of the breast cancer.
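The comparison of values obtained at successive intervals can be illustrated with a short Python sketch. It assumes a series of marker protein concentrations measured in the same units for one individual; the function name and the 10% `tolerance` threshold are hypothetical choices made for illustration and are not specified in the disclosure.

```python
# Illustrative sketch only. The 10% tolerance is an assumed threshold, not a disclosed value.
def classify_trend(levels, tolerance=0.10):
    """Label the change between each pair of consecutive measurements.

    Returns one label per interval: "increase", "decrease", or "stable".
    """
    calls = []
    for previous, current in zip(levels, levels[1:]):
        if previous <= 0:
            raise ValueError("previous level must be positive to compute a relative change")
        change = (current - previous) / previous
        if change > tolerance:
            calls.append("increase")   # may correlate with progression
        elif change < -tolerance:
            calls.append("decrease")   # may correlate with regression
        else:
            calls.append("stable")
    return calls


# Made-up concentrations (arbitrary units) at four sampling times.
trend = classify_trend([100, 130, 125, 80])
```

How a given change should be interpreted clinically depends on the marker and the assay; the labels here simply mirror the increase and decrease cases described above.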

Thus, breast cancer may be identified by the presence of breastcancer-associated proteins as taught herein. Once identified, the breastcancer may be treated using compounds that reduce in vivo the expressionand/or biological activity of the breast cancer-associated proteins.Furthermore, the methods provided herein can be used to monitor theprogression and/or treatment of the disease. The following non-limitingexamples provide details of the isolation and characterization of breastcancer-associated proteins and methods for their use in the detection ofbreast cancer.

EXAMPLE 1 Identification of Breast Cancer Markers

To identify markers for breast cancer, the sera of individuals with breast cancer were compared to the sera of normal individuals by surface-enhanced laser desorption and ionization (SELDI) mass spectrometry. Briefly, 0.5 mL aliquots of sera harvested from the individuals were thawed. Then, 1 μL of a 1 mg/mL solution of soybean trypsin inhibitor (SBTI) and 1 μL of a 1 mg/mL solution of leupeptin were added to each aliquot. To remove lipids, 350 μL of 1,1,2-trifluorotrichloroethane was added to each sample. The samples then were vortexed for five minutes and centrifuged in a microcentrifuge for five minutes at 4° C. The resulting supernatants were applied to a 1 mL column of agarose coupled to protein G (Hitrap Protein G column, Pharmacia and Upjohn, Peapack, N.J.) to remove immunoglobulin proteins. The column then was rinsed with 3 mL of 50 mM sodium phosphate, pH 7.0, with SBTI and leupeptin (“binding buffer”), and the resulting flowthrough applied directly to a 5 mL column of 6% Sepharose coupled to Cibacron blue (Hitrap blue column, Pharmacia and Upjohn, Peapack, N.J.) to remove albumin proteins. The Hitrap blue column was rinsed with 20 mL of binding buffer. The resulting flowthrough was concentrated using four centrifugation-based concentrators with a 10 kD cutoff (Centricon 10, Millipore Corporation, Bedford, Mass.) to a final volume of about 0.7 mL.

The resulting serum (substantially free of immunoglobulin and albumin) was subdivided into twelve fractions containing approximately equal amounts of protein by ion exchange chromatography. Specifically, the serum was applied to a Mono Q (Pharmacia and Upjohn, Peapack, N.J.) ion exchange column (a strong anion exchanger with quaternary ammonium groups) in 50 mM sodium phosphate buffer, pH 7.0, and proteins were eluted from the column by increasing the concentration of sodium chloride in a stepwise manner. Thus, the serum was divided into twelve fractions based on the concentration of sodium chloride used for elution. These fractions accordingly were designated flow through, 25 mM, 50 mM, 75 mM, 100 mM, 125 mM, 150 mM, 200 mM, 250 mM, 300 mM, 400 mM, and 2M sodium chloride. After elution, each fraction was concentrated to approximately 100 μg/mL and buffer exchanged into binding buffer.

Then 4-10 μL from each of the twelve fractions were applied and allowed to bind to each of four SELDI chip surfaces, each surface holding up to eight samples. The intended location of each sample on the chip was demarcated with a circle drawn using a hydrophobic marker like those used in Pap smears. The SELDI chips used herein were purchased from Ciphergen Biosystems, Inc., Palo Alto, Calif., and used as described below.

For copper or nickel surfaces, a chip containing ethylenediaminetriacetic acid moieties (IMAC, Ciphergen Biosystems, Inc., Palo Alto, Calif.) was pretreated with two five-minute applications of five μL of a copper salt or nickel salt solution, and washed with deionized water. After a five-minute treatment with five μL of binding buffer, two to three microliters of sample were applied to the surface for thirty to sixty minutes. Another two to three microliters of sample were then applied for an additional thirty to sixty minutes. The chips then were washed twice with binding buffer to remove unbound proteins. Then, 0.5 μL of sinapinic acid (12.5 mg/mL) was added twice and allowed to dry each time. The presence of sinapinic acid enhances the vaporization and ionization of the bound proteins upon mass spectrometry.

For chip surfaces containing carboxyl moieties (WCX-2, Ciphergen Biosystems, Inc., Palo Alto, Calif.), before use of the hydrophobic pen, the surface was washed with 10 mM HCl for thirty minutes and rinsed five times with deionized water. After use of the pen, the surface was washed five times with five μL of binding buffer and once with deionized water. Two to three μL of sample were applied in two applications of thirty to sixty minutes each. The surface was washed twice with 5 μL of binding buffer, and 0.5 μL of sinapinic acid was applied twice.

For chip surfaces containing quaternary ammonium moieties (SAX-2, Ciphergen Biosystems, Inc., Palo Alto, Calif.), after use of the pen, the surface was washed five times with five μL of binding buffer and once with deionized water. Application of sample, washing, and application of sinapinic acid were done as described above.

The chips then were subjected to mass spectrometry utilizing a Ciphergen SELDI PBS One (Ciphergen Biosystems, Inc., Palo Alto, Calif.) running the software program “SELDI v. 2.0”. For all chips, “high mass” was set to 200,000 Daltons, “starting detector sensitivity” was set to 9 (from a range of 1-10, with 10 being the highest sensitivity), NDF (neutral density filter) was set to “OUT”, data acquisition method was set to “Seldi Quantitation”, SELDI acquisition parameters were set to 20, with increments of 5, and warming with two shots at intensity 50 (out of 100) was included. For IMAC chips, mass was optimized from 3000 Daltons to 3001 Daltons, starting laser intensity was set to 80 (out of 100), and transients set to 5 (i.e., 5 laser shots per site). Peaks were identified automatically by the computer. For WCX-2 chips, mass was optimized from 3,000 Daltons to 50,000 Daltons, starting laser intensity was set to 80, and transients set to 8. Peaks were identified automatically by the computer. For SAX-2 chips, mass was optimized from 3,000 Daltons to 50,000 Daltons, starting laser intensity was set to 85, and transients set to 8. Peaks were identified automatically by the computer.

Ten serum samples (five from normal individuals and five from individuals with breast cancer) were analyzed by mass spectrometry to identify the proteins present in the sixty fractions described above. The resulting peaks in the mass spectrometry trace were compared to identify those peaks present in the serum samples from individuals with breast cancer but not present in the normal samples. If peaks in different samples had a mass difference of no more than one percent, the peaks were presumed to be the same. Eleven mass spectrometry peaks ranging in size from just over 11,000 Da to approximately 103,000 Da were identified as present in all five serum samples from individuals with breast cancer and in none of the samples from normal individuals. The presence or absence of these peaks was then determined for an additional thirty serum samples (fifteen from normal individuals and fifteen from individuals with breast cancer). Seven other peaks that were present in four of the original five breast cancer serum samples, but not in any of the normal samples, were also analyzed because they were present in the same fraction and on the same SELDI surface as one or more of the eleven peaks already under evaluation. Of the eighteen peaks studied, fifteen were present in fifteen or more of the twenty breast cancer serum samples, but absent from fifteen or more of the normal serum samples.
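The peak comparison rule described above (peaks whose masses differ by no more than one percent are presumed to be the same) can be illustrated with the following Python sketch. It is not the software used in the Example; the function names are hypothetical, the masses in the usage lines are invented, and measuring the one percent difference relative to the larger of the two masses is an assumption, since the text does not state the reference mass.

```python
# Illustrative sketch only; not the instrument software. Names and data are hypothetical.
def same_peak(mass_a, mass_b, tolerance=0.01):
    """Treat two peaks as the same if their masses differ by no more than `tolerance`.

    Assumption: the difference is taken relative to the larger mass.
    """
    return abs(mass_a - mass_b) <= tolerance * max(mass_a, mass_b)


def present(mass, peaks, tolerance=0.01):
    """True if any peak in `peaks` matches `mass` within the tolerance."""
    return any(same_peak(mass, peak, tolerance) for peak in peaks)


def cancer_specific_peaks(cancer_samples, normal_samples, tolerance=0.01):
    """Return peaks of the first cancer sample found in every cancer sample and in no normal sample."""
    reference = cancer_samples[0]
    return [
        mass for mass in reference
        if all(present(mass, sample, tolerance) for sample in cancer_samples[1:])
        and not any(present(mass, sample, tolerance) for sample in normal_samples)
    ]


# Invented peak lists (Da), one list per serum sample.
cancer = [[28258, 66400], [28300, 66500], [28100, 66390]]
normal = [[66450], [66300, 12000]]
specific = cancer_specific_peaks(cancer, normal)
```

In this made-up data set the peak near 66,400 Da is excluded because a matching peak appears in the normal samples, while the peak near 28,258 Da is retained.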

The results of the foregoing analyses are summarized in Table 1. The masses listed in the table are presumed accurate to within one percent.

TABLE 1

            Mono Q fraction    SELDI chip     Number of positive samples   Number of positive samples
Mass (Da)   (mM sodium         surface used   from individuals with        from individuals without
            chloride)                         breast cancer                breast cancer
16210       0 (flow-through)   Nickel         17                           1
17188       25 mM              WCX-2          17                           2
30183       25 mM              WCX-2          15                           3
34664       25 mM              WCX-2          16                           4
20050       50 mM              Nickel         19                           0
28258       50 mM              Nickel         20                           0
24170       50 mM              Nickel         17                           0
35393       50 mM              Nickel         17                           3
34908       50 mM              WCX-2          16                           2
70908       100 mM             WCX-2          20                           0
17840       100 mM             WCX-2          18                           2
11709       150 mM             SAX-2          20                           0
42354       200 mM             Nickel         17                           0
56280       200 mM             Nickel         16                           0
34517       400 mM             Copper         18                           1
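As a reading aid, the counts in Table 1 can be converted into per-peak detection rates. The Python sketch below assumes twenty breast cancer sera and twenty normal sera in total (five plus fifteen of each, per Example 1); the terms "sensitivity" and "specificity" and the function names are introduced here for illustration and do not appear in the Example.

```python
# Illustrative sketch only. Counts are transcribed from Table 1:
# mass (Da) -> (positive breast cancer sera, positive normal sera).
TABLE_1 = {
    16210: (17, 1), 17188: (17, 2), 30183: (15, 3), 34664: (16, 4),
    20050: (19, 0), 28258: (20, 0), 24170: (17, 0), 35393: (17, 3),
    34908: (16, 2), 70908: (20, 0), 17840: (18, 2), 11709: (20, 0),
    42354: (17, 0), 56280: (16, 0), 34517: (18, 1),
}
N_CANCER = 20  # assumed total: five initial plus fifteen additional cancer sera
N_NORMAL = 20  # assumed total: five initial plus fifteen additional normal sera


def rates(mass):
    """Return (sensitivity, specificity) for one peak under the assumed sample totals."""
    positive_cancer, positive_normal = TABLE_1[mass]
    sensitivity = positive_cancer / N_CANCER
    specificity = (N_NORMAL - positive_normal) / N_NORMAL
    return sensitivity, specificity


def perfect_markers():
    """Peaks detected in every cancer serum and in no normal serum."""
    return sorted(mass for mass in TABLE_1 if rates(mass) == (1.0, 1.0))
```

Under these assumptions, three peaks (11709, 28258, and 70908 Da) were detected in all cancer sera and in none of the normal sera. With only twenty samples per group, such rates are rough estimates.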

EXAMPLE 2 Sequencing of Breast Cancer Marker Proteins

Breast cancer-associated proteins based upon the biochemical and massspectrometry data provided above may be better characterized usingwell-known techniques. For example, samples of the serum can befractionated using, for example, column chromatography and/orelectrophoresis, to produce purified protein samples corresponding toeach of the proteins identified in Table 1. The sequences of theisolated proteins can then be determined using conventional peptidesequencing methodologies (see Examples 5 and 6). It is appreciated thatthe skilled artisan, in view of the foregoing disclosure, would be ableto produce an antibody directed against any breast cancer-associatedprotein identified by the methods described herein. Moreover, theskilled artisan, in view of the foregoing disclosure, would be able toproduce nucleic acid sequences that encode the fragments describedabove, as well as nucleic acid sequences complementary thereto. Inaddition, the skilled artisan using conventional recombinant DNAmethodologies, for example, by screening a cDNA library with such anucleic acid sequence, would be able to isolate full length nucleic acidsequences encoding target breast cancer-associated proteins. Such fulllength nucleic acid sequences, or fragments thereof, may be used togenerate nucleic acid-based detection systems or therapeutics.

EXAMPLE 3 Production of Antibodies Which Bind Specifically to Breast Cancer-Associated Proteins

Once identified, a breast cancer-associated protein may be detected in atissue or body fluid sample using numerous binding assays that are wellknown to those of ordinary skill in the art. For example, as discussedabove, a breast cancer-associated protein may be detected in either atissue or body fluid sample using an antibody, for example, a monoclonalantibody, which binds specifically to an epitope disposed upon thebreast cancer-associated protein. In such detection systems, theantibody preferably is labeled with a detectable moiety.

Provided below is an exemplary protocol for the production of an anti-breast cancer-associated monoclonal antibody. Other protocols also are envisioned. Accordingly, the particular method of producing antibodies to target proteins is not envisioned to be an aspect of the invention.

Balb/c ByJ mice (Jackson Laboratory, Bar Harbor, Me.) are injected intraperitoneally with the target protein every 2 weeks until the immunized mice obtain the appropriate serum titer. Thereafter, the mice are injected with 3 consecutive intravenous boosts. Freund's complete adjuvant (Gibco, Grand Island) is used in the first injection, incomplete Freund's in the second injection, and saline is used for subsequent intravenous injections. The animal then is sacrificed and its spleen removed. Spleen cells (or lymph node cells) then are fused with a mouse myeloma line, e.g., using the method of Kohler et al. (1975) Nature 256: 495. Hybridomas producing antibodies that react with the target proteins then are cloned and grown as ascites. Hybridomas are screened by reactivity to the immunogen in any desirable assay. Detailed descriptions of screening protocols, ascites production and immunoassays also are disclosed in PCT/US92/09220, published May 13, 1993.

EXAMPLE 4 Antibody-Based Assay for Detecting Breast Cancer in an Individual

The following assay has been developed for tissue samples; however, it is contemplated that similar assays for testing fluid samples may be developed without undue experimentation. A typical assay may employ a commercial immunodetection kit, for example, the ABC Elite Kit from Vector Laboratories, Inc.

A biopsy sample is removed from the patient under investigation in accordance with the appropriate medical guidelines. The sample then is applied to a glass microscope slide and the sample fixed in cold acetone for 10 minutes. Then, the slide is rinsed in distilled water and pretreated with a hydrogen peroxide-containing solution (2 mL 30% H₂O₂ and 30 mL cold methanol). The slide then is rinsed in a Buffer A comprising Tris Buffered Saline (TBS) with 0.1% Tween and 0.1% Brij. A mouse anti-breast cancer-associated protein monoclonal antibody in Buffer A is added to the slide and the slide then incubated for one hour at room temperature. The slide then is washed with Buffer A, and a secondary antibody (ABC Elite Kit, Vector Labs, Inc.) in Buffer A is added to the slide. The slide then is incubated for 15 minutes at 37° C. in a humidity chamber. The slide is washed again with Buffer A, and the ABC reagent (ABC Elite Kit, Vector Labs, Inc.) is then added to the slide for amplification of the signal. The slide is then incubated for a further 15 minutes at 37° C. in the humidity chamber.

The slide then is washed in distilled water, and a diaminobenzidine (DAB) substrate added to the slide for 4-5 minutes. The slide then is rinsed with distilled water, counterstained with hematoxylin, rinsed with 95% ethanol, rinsed with 100% ethanol, and then rinsed with xylene. A cover slip is then applied to the slide and the result observed by light microscopy.

EXAMPLE 5 Purification and Characterization of 28.3 kD Breast Cancer Protein

The 28.3 kD breast cancer protein identified in Example 1 was isolated and further characterized as follows.

Approximately 30 mL of serum (combined from multiple breast cancerpatients) was depleted of immunoglobulin G and serum albumin usingProtein G chromatography and Cibacron Blue agarose chromatography,respectively, using standard methodologies such as those described inExample 1. The albumin and immunoglobulin depleted serum then wasfractionated by Mono Q ion-exchange affinity chromatography. Briefly,the serum proteins were applied to a 5 mL Mono Q column (Pharmacia andUpjohn, Peapack, N.J.) in 50 mM sodium phosphate buffer, pH 7.0, and theflow through fraction collected. Thereafter, the serum proteins wereeluted stepwise from the column using 50 mM sodium phosphate buffer, pH7.0 containing increasing concentrations of sodium chloride. In thismanner, 12 serum fractions were obtained, each containing a differentamount of sodium chloride. The fractions included flow through, andelution buffers of 50 mM sodium phosphate buffer, pH 7.0 containing 25mM, 50 mM, 75 mM, 100 mM, 125 mM, 150 mM, 200 mM, 250 mM, 300 mM, 400mM, and 2M sodium chloride.

The 50 mM sodium chloride fraction containing the protein of interest was subsequently buffer exchanged back into 50 mM sodium phosphate buffer, pH 7.0 and concentrated by means of a Centricon 10 (Millipore) in accordance with the manufacturer's instructions. The resulting sample then was fractionated by size exclusion chromatography on a Sephacryl S-200 column (Pharmacia) using an isocratic buffer containing 100 mM sodium phosphate, 150 mM NaCl, pH 7.4. Fractions that eluted from the column were evaluated for the presence of the 28.3 kD protein using the Ciphergen SELDI mass spectroscopy as described in Example 1. Fractions containing the 28.3 kD protein were pooled and applied to an IMAC column (Sigma) which had been pre-loaded with Ni²⁺ by prior incubation with 50 mM NiCl₂. The IMAC column then was washed with 6 bed volumes of a solution containing 100 mM sodium phosphate, 150 mM NaCl, pH 7.4, and the bound protein fraction eluted with the same solution containing 100 mM imidazole. The eluted fraction then was concentrated by means of a Minicon 10 (Millipore) and then was fractionated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) on a 12% Tris-glycine SDS-PAGE gel. Samples of the protein fraction were applied to two separate lanes of the gel. After electrophoresis, the resulting gel then was stained with Coomassie Brilliant Blue dye and destained to reveal the presence of proteins. Three bands of about 28.3 kD (characterized as the heaviest molecular weight protein, the medium molecular weight protein, and the lightest molecular weight protein) were excised from one of the two lanes and were eluted from the acrylamide slices.

The proteins were eluted from the gel as follows. Briefly, the gel slices were washed five times with HPLC grade water with vigorous vortexing. The washed slices then were cut into small pieces in 120 μL of 100 mM sodium acetate pH 8.5, 0.1% SDS and incubated overnight at 37° C. The supernatant was decanted into a fresh tube and dried in a speedvac. The resulting pellet then was reconstituted in 37 μL HPLC grade water. Approximately 1480 μL of cold ethanol then was added and the resulting mixture incubated overnight at −20° C. The sample was centrifuged at 4° C. for 15 minutes at 11,000 rpm. The supernatant was removed and the resulting pellet reconstituted in 5 μL of water. The resulting protein solutions were run on the SELDI and the 28.3 kD protein was identified in one of the three preparations (see FIG. 1A which corresponds to the heaviest 28 kD protein). The corresponding band then was excised from the second of the two lanes on the gel. After proteolysis with trypsin, the tryptic fragments were eluted from the gel and submitted for microsequence analysis via mass spectrometry.

Four individual masses were detected by mass spectrometry. When the four masses were used to search the Swiss Protein Database, all four masses were found to match amino acid sequences present in the protein referred to in the art as U2 small nuclear ribonucleoprotein B″ (U2 snRNP B″) (Habets et al. (1987) supra, Swiss Protein Database Accession Number 4507123). The results are summarized in Table 2.

TABLE 2

Peptide  Sequence                 SEQ ID NO  Protein
1        QLQGFPFYGKPMR            1          U2 snRNP B″
2        HDIAFVEFENDGQAGAAR       2          U2 snRNP B″
3        LVPGRHDIAFVEFENDGQAGAAR  3          U2 snRNP B″
4        TVEQTATTTNK              4          U2 snRNP B″

The amino acid sequence, in an N- to C-terminal direction, of the U2 snRNP B″ protein in single-letter amino acid code is:

(SEQ ID NO: 5)
MDIRPNHTIY INNMNDKIKK EELKRSLYAL FSQFGHVVDI VALKTMKMRG
QAFVIFKELG SSTNALRQLQ GFPFYGKPMR IQYAKTDSDI ISKMRGTFAD
KEKKKEKKKA KTVEQTATTT NKKPGQGTPN SANTQGNSTP NPQVPDYPPN
YILFLNNLPE ETNEMMLSML FNQFPGFKEV RLVPGRHDIA FVEFENDGQA
GAARDALQGF KITPSHAMKI TYAKK
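The correspondence between the four Table 2 peptides and SEQ ID NO: 5 can be checked at the sequence level with a simple in-silico tryptic digest. The sketch below is not the database mass search used in this example. It assumes the conventional trypsin rule (cleavage after K or R unless the next residue is P) and allows one missed cleavage. The function name `tryptic_peptides` and its `max_missed` parameter are illustrative.

```python
import re

# SEQ ID NO: 5 as given in the text (225 residues).
U2_SNRNP_B2 = (
    "MDIRPNHTIYINNMNDKIKKEELKRSLYALFSQFGHVVDIVALKTMKMRG"
    "QAFVIFKELGSSTNALRQLQGFPFYGKPMRIQYAKTDSDIISKMRGTFAD"
    "KEKKKEKKKAKTVEQTATTTNKKPGQGTPNSANTQGNSTPNPQVPDYPPN"
    "YILFLNNLPEETNEMMLSMLFNQFPGFKEVRLVPGRHDIAFVEFENDGQA"
    "GAARDALQGFKITPSHAMKITYAKK"
)

# Peptides 1-4 of Table 2 (SEQ ID NOs: 1-4).
TABLE_2 = [
    "QLQGFPFYGKPMR",
    "HDIAFVEFENDGQAGAAR",
    "LVPGRHDIAFVEFENDGQAGAAR",
    "TVEQTATTTNK",
]


def tryptic_peptides(sequence, max_missed=1):
    """Return the set of tryptic peptides with up to max_missed missed cleavages.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    """
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]
    peptides = set()
    for i in range(len(bounds) - 1):
        last = min(i + 2 + max_missed, len(bounds))
        for j in range(i + 1, last):
            peptides.add(sequence[bounds[i]:bounds[j]])
    return peptides


digest = tryptic_peptides(U2_SNRNP_B2, max_missed=1)
for peptide in TABLE_2:
    status = "found" if peptide in digest else "not found"
    print(f"{peptide}: {status}")
```

Under these assumptions, peptides 1, 2, and 4 are complete cleavage products, and peptide 3 is peptide 2 extended at its N-terminus by LVPGR, which corresponds to a single missed cleavage.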

EXAMPLE 6 Purification and Characterization of 71 kD Breast Cancer Protein

The 71 kD breast cancer protein identified in Example 1 was isolated and further characterized as follows.

50 mL of serum from each of four individuals was pooled to give a single aliquot of 200 mL. This 200 mL aliquot was subdivided into six aliquots of 33 mL each. Each aliquot was treated with 19 mL of trifluorotrichloroethane as described in Example 1. Each aliquot was applied to Protein G and Cibacron Blue columns as described in Example 1. Fractions containing protein in the flowthrough (approximately 500 mL/aliquot) were pooled and concentrated to approximately 10 mL/aliquot (60 mL total) using Centricon concentrators.

3 mL aliquots were loaded onto 5 mL Mono Q Sepharose columns (60 mL/3 mL = 20 aliquots). Fractionation was performed as described in Example 1, except that all volumes were multiplied by 5. The fractions eluted with 100 mM sodium chloride from each fractionation were pooled into a single 200 mL fraction and buffer exchanged into binding buffer as described in Example 1.
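The volume bookkeeping in the two preceding paragraphs can be traced with a few lines of arithmetic. This is a sketch using only the volumes stated in the text; the variable names are illustrative, and the concentration factor is an approximation because the flowthrough and concentrate volumes are themselves given as approximate.

```python
# Volumes in mL, taken from the text; all names are hypothetical.
DONORS = 4
SERUM_PER_DONOR_ML = 50
N_ALIQUOTS = 6
FLOWTHROUGH_PER_ALIQUOT_ML = 500   # approximate, as stated
CONCENTRATE_PER_ALIQUOT_ML = 10    # approximate, as stated
MONO_Q_LOAD_ML = 3

pooled_ml = DONORS * SERUM_PER_DONOR_ML
per_aliquot_ml = pooled_ml / N_ALIQUOTS            # stated as 33 mL each
concentrate_ml = N_ALIQUOTS * CONCENTRATE_PER_ALIQUOT_ML
mono_q_loads = concentrate_ml // MONO_Q_LOAD_ML
concentration_factor = FLOWTHROUGH_PER_ALIQUOT_ML / CONCENTRATE_PER_ALIQUOT_ML

print(f"pooled serum: {pooled_ml} mL")
print(f"per aliquot: {per_aliquot_ml:.1f} mL")
print(f"concentrate: {concentrate_ml} mL in {mono_q_loads} Mono Q loads")
print(f"approximate concentration factor: {concentration_factor:.0f}x")
```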

The 200 mL fraction was applied to a series of antibody columns to remove abundant proteins of 50-70 kD. Each of these proteins, alpha-1 antitrypsin, ceruloplasmin, kallikrein, and GC-globulin, had been identified and sequenced during preliminary attempts to isolate the 71 kD protein. Commercial antibodies to each of the proteins were purchased and coupled to a solid support (agarose) using conventional NHS ester chemistry (Pierce Aminolink Plus kit, part number 44894). The 200 mL fraction was applied to each column in turn until the protein in question could no longer be seen in the flowthrough by Western blot analysis.

The flowthrough was subjected to size exclusion chromatography using an S200 column. Fractions containing the 71 kD peak were identified by SELDI as described in Example 1. Because these fractions also appeared to contain a fragment of human serum albumin (HSA) that would not bind to the Cibacron blue column, the fractions were applied to an HSA affinity column with two murine antibodies to HSA to deplete the remaining HSA from the sample. SDS-PAGE analysis of the sample revealed a single band in the 71 kD range by silver staining. The remaining sample was divided into two aliquots and run on two lanes of a 10% tris-glycine gel. The gel was stained with Coomassie Brilliant Blue dye. The 71 kD band from one of the two lanes was excised and eluted from the gel as described in Example 5. Its identity as the 70.972 kD marker protein was confirmed by SELDI. The 71 kD band from the other lane was excised and treated with trypsin. The resulting peptides were eluted from the gel and subjected to microsequence analysis by mass spectrometry. Sixteen of the predicted trypsin fragments of the 64-kD subunit of cleavage stimulation factor have masses corresponding to those identified in the mass spectrum of the 71 kD protein. The sixteen sequences are set forth in Table 3. Two reported sequences for cleavage stimulation factor are set forth in the Sequence Listing as SEQ ID NO:22 and SEQ ID NO:23.

TABLE 3

Peptide  Sequence                      SEQ ID NO  Protein
1        GQVPMQDPR                     6          Cleavage Stimulation Factor
2        GSLPANVPTPR                   7          Cleavage Stimulation Factor
3        GLLGDAPNDPR                   8          Cleavage Stimulation Factor
4        AGLTVRDPAVDR                  9          Cleavage Stimulation Factor
5        ALRVDNAASEKNK                 10         Cleavage Stimulation Factor
6        GGTLLSVTGEVEPR                11         Cleavage Stimulation Factor
7        DIFSEVGPVVSFR                 12         Cleavage Stimulation Factor
8        GIDARGMEARAMEAR               13         Cleavage Stimulation Factor
9        GMEARAMEARGLDAR               14         Cleavage Stimulation Factor
10       AVASLPPEQMFELMK               15         Cleavage Stimulation Factor
11       AMEARAMEVRGMEAR               16         Cleavage Stimulation Factor
12       GYLGPPHQGPPMHHVPGHESR         17         Cleavage Stimulation Factor
13       GPIPSGMQGPSPINMGAVVPQGSR      18         Cleavage Stimulation Factor
14       NMLLQNPQLAYALLQAQVVMR         19         Cleavage Stimulation Factor
15       GGPLPEPRPLMAEPRGPMLDQR        20         Cleavage Stimulation Factor
16       SLGTGAPVIESPYGETISPEDAPESISK  21         Cleavage Stimulation Factor
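As a consistency check on Table 3, the sketch below confirms that each listed peptide ends in K or R, as expected for a tryptic fragment, and counts internal K or R residues that would represent missed cleavages. It assumes the conventional trypsin rule (no cleavage before P), which this example does not state. The function name `missed_cleavages` is illustrative.

```python
import re

# Peptides 1-16 of Table 3 (SEQ ID NOs: 6-21), in table order.
TABLE_3 = [
    "GQVPMQDPR",
    "GSLPANVPTPR",
    "GLLGDAPNDPR",
    "AGLTVRDPAVDR",
    "ALRVDNAASEKNK",
    "GGTLLSVTGEVEPR",
    "DIFSEVGPVVSFR",
    "GIDARGMEARAMEAR",
    "GMEARAMEARGLDAR",
    "AVASLPPEQMFELMK",
    "AMEARAMEVRGMEAR",
    "GYLGPPHQGPPMHHVPGHESR",
    "GPIPSGMQGPSPINMGAVVPQGSR",
    "NMLLQNPQLAYALLQAQVVMR",
    "GGPLPEPRPLMAEPRGPMLDQR",
    "SLGTGAPVIESPYGETISPEDAPESISK",
]


def missed_cleavages(peptide):
    """Count internal K/R residues not followed by P (assumed trypsin rule)."""
    return sum(
        1
        for m in re.finditer(r"[KR](?!P)", peptide)
        if m.end() < len(peptide)  # ignore the C-terminal residue
    )


counts = [missed_cleavages(p) for p in TABLE_3]
for number, (peptide, count) in enumerate(zip(TABLE_3, counts), start=1):
    print(f"{number:2d}  {peptide:<28}  ends in {peptide[-1]}  missed: {count}")
```

A nonzero count only indicates that the peptide spans a potential cleavage site under the assumed rule; it says nothing about the quality of the mass match itself.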

Equivalents

The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced by reference therein.

Incorporation By Reference

The entire disclosure of each of the aforementioned patent and scientific documents cited hereinabove is expressly incorporated by reference herein.

1-42. (canceled)
43. A method of diagnosing breast cancer in a mammal, the method comprising the steps of: (a) obtaining a sample isolated from the mammal; and (b) detecting in the sample the presence or absence of a protein characterized as comprising an amino acid sequence selected from the group consisting of SEQ ID NO:6; SEQ ID NO:7; SEQ ID NO:8; SEQ ID NO:9; SEQ ID NO:10; SEQ ID NO:11; SEQ ID NO:12; SEQ ID NO:13; SEQ ID NO:14; SEQ ID NO:15; SEQ ID NO:16; SEQ ID NO:17; SEQ ID NO:18; SEQ ID NO:19; SEQ ID NO:20; SEQ ID NO:21; SEQ ID NO:22; and SEQ ID NO:23, wherein the presence of the protein is indicative of the presence of breast cancer in the mammal, and wherein the absence of the protein is indicative of the absence of breast cancer in the mammal.
44. (canceled)
45. The method of claim 43 or 70, wherein the sample comprises breast tissue.
46. The method of claim 43 or 70, wherein the sample comprises a body fluid.
47. The method of claim 46, wherein the body fluid is selected from the group consisting of blood, serum, plasma, sweat, tears, urine, peritoneal fluid, lymph, vaginal secretions, semen, spinal fluid, ascitic fluid, saliva, sputum, and breast exudate.
48. A method of diagnosing breast cancer in a mammal, the method comprising the steps of: (a) contacting a sample derived from the mammal with a binding moiety that binds specifically to a protein comprising an amino acid sequence of SEQ ID NO:23, thereby to produce a complex; and (b) detecting the presence or absence of the complex, wherein the presence of the complex is indicative of the presence of breast cancer in the mammal and wherein the absence of the complex is indicative of the absence of breast cancer in the mammal.
49. (canceled)
50. The method of claim 48 or 71, wherein the binding moiety is selected from the group consisting of an antibody, an antibody fragment and a biosynthetic antibody binding site.
51. The method of claim 48 or 71, wherein the binding moiety is an antibody.
52. The method of claim 51, wherein the antibody is a monoclonal antibody.
53. The method of claim 50, wherein the binding moiety is labeled with a detectable moiety.
54. The method of claim 48, wherein the absence of a detectable amount of the complex is indicative of the absence of breast cancer.
55-62. (canceled)
63. The method of claim 46, wherein the body fluid is serum.
64. The method of claim 48 or 71, wherein the sample comprises breast tissue.
65. The method of claim 48 or 71, wherein the sample comprises a body fluid.
66. The method of claim 65, wherein the body fluid is selected from the group consisting of blood, serum, plasma, sweat, tears, urine, peritoneal fluid, lymph, vaginal secretions, semen, spinal fluid, ascitic fluid, saliva, sputum, and breast exudate.
67. The method of claim 65, wherein the body fluid is serum.
68. The method of claim 43, wherein the presence of a detectable amount of the protein is indicative of the presence of breast cancer in the mammal.
69. The method of claim 43, wherein the absence of a detectable amount of the protein is indicative of the absence of breast cancer in the mammal.
70. A method of diagnosing breast cancer in a mammal, the method comprising the step of: determining whether a protein comprising an amino acid sequence selected from the group consisting of SEQ ID NO:6; SEQ ID NO:7; SEQ ID NO:8; SEQ ID NO:9; SEQ ID NO:10; SEQ ID NO:11; SEQ ID NO:12; SEQ ID NO:13; SEQ ID NO:14; SEQ ID NO:15; SEQ ID NO:16; SEQ ID NO:17; SEQ ID NO:18; SEQ ID NO:19; SEQ ID NO:20; SEQ ID NO:21; SEQ ID NO:22; and SEQ ID NO:23 is present in a sample derived from the mammal in an amount greater than or equal to a threshold value indicative of the presence of breast cancer in the mammal, wherein an amount of protein greater than or equal to the threshold value is indicative of the presence of breast cancer in the mammal and an amount of protein less than the threshold value is indicative of the absence of breast cancer in the mammal.
71. A method of diagnosing breast cancer in a mammal, the method comprising the steps of: (a) contacting a sample derived from the mammal with a binding moiety that binds specifically to a protein comprising an amino acid sequence of SEQ ID NO:23, thereby to produce a complex; and (b) determining whether the complex is present in an amount greater than or equal to a threshold value indicative of the presence of breast cancer in the mammal, wherein an amount greater than or equal to the threshold value is indicative of the presence of breast cancer in the mammal and an amount less than the threshold value is indicative of the absence of breast cancer in the mammal.